havakv / pycox

Survival analysis with PyTorch
BSD 2-Clause "Simplified" License
780 stars 180 forks

Training Time and CPU Usage #142

Open xiangtgao opened 1 year ago

xiangtgao commented 1 year ago

Hello, thank you so much for the excellent package on survival analysis.

I am using the DeepHit model and observed something odd. With a dataset whose input dimension is over 200, I can see that training uses multiple CPUs at the same time. I then selected a subset of the training features, leaving an input dimension of only 6, but now the model is 3 times slower than the one with far more features, and this time it does not use multiple CPUs. I assumed this was related to the number of workers in the fit function, but changing it has no effect. Do you have any insight into this?

havakv commented 1 year ago

The number of workers set in the fit function only affects the data pipeline. So if you have a lot of demanding pre-processing to do on the data before passing it to the net, multiple workers can be used for that. I think the tensor computations in the network use all available cores by default.
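
One way to check this is to look at PyTorch's intra-op thread count directly and time a small network at both input sizes. The following is only a minimal sketch in plain PyTorch, not pycox code: the helper `time_training_steps`, the layer sizes, and the batch size are made up for illustration, and the timings will depend on the machine. `torch.get_num_threads` and `torch.set_num_threads` control the threads used for CPU tensor operations, which is separate from the data-loading workers.

```python
import time
import torch

# Threads PyTorch uses for CPU tensor ops (independent of data-loading workers).
default_threads = torch.get_num_threads()
print("intra-op threads:", default_threads)


def time_training_steps(in_features, n_threads, batch_size=256, repeats=50):
    """Time forward + backward passes of a small MLP (hypothetical sizes)."""
    torch.set_num_threads(n_threads)
    net = torch.nn.Sequential(
        torch.nn.Linear(in_features, 32),
        torch.nn.ReLU(),
        torch.nn.Linear(32, 10),
    )
    x = torch.randn(batch_size, in_features)
    start = time.perf_counter()
    for _ in range(repeats):
        net.zero_grad()
        net(x).sum().backward()
    return time.perf_counter() - start


# Compare the two input sizes from the question with 1 thread and the default.
for in_features in (6, 200):
    for n_threads in sorted({1, default_threads}):
        elapsed = time_training_steps(in_features, n_threads)
        print(f"in_features={in_features:3d} threads={n_threads}: {elapsed:.4f}s")

# Restore the original thread setting.
torch.set_num_threads(default_threads)
```

If the small-input case is not faster with more threads, the slowdown is probably coming from somewhere other than the tensor computations, such as per-batch overhead in the data pipeline.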