compomics / DeepLC

DeepLC: Retention time prediction for (modified) peptides using Deep Learning.
https://iomics.ugent.be/deeplc
Apache License 2.0

DeepLC CPU vs GPU performance #20

Closed markmipt closed 3 years ago

markmipt commented 3 years ago

Hello,

First of all, thanks for this outstanding software for RT prediction!

I'm actively using DeepLC in my workflow, and it has now become the bottleneck in the total analysis time. So I'm thinking of switching from CPU-based DeepLC calculations to GPU. But I want to estimate the effect of this change before buying a powerful video card for the server. Have you done any CPU vs GPU comparison of DeepLC run times?

Regards, Mark

RobbinBouwmeester commented 3 years ago

Hi Mark,

Great that you are using DeepLC, hope it offers everything you need.

In terms of buying a GPU: I can run some small-scale tests with a 1080 Ti vs 2080 vs 2x 2080 vs CPU. Do keep in mind that prediction involves a lot of overhead compared to training, so I do not expect the same performance boost for prediction as for training.

Currently, do you enable multi-threading? And can you indicate what the run time is (let's say per 1k peptides)? Also, if accuracy is not the main issue, you can always select fewer models. I could also train a smaller model whose prediction times should be faster, at the cost of some accuracy.

Hope that helps!

Kind regards,

Robbin

markmipt commented 3 years ago

Hi Robbin,

Thanks for the detailed answers! I was also worried about the overhead in prediction, and that was basically the main reason I asked about tests. I would be happy with even a single 1080 Ti vs CPU test.

I run DeepLC from the command line with the basic arguments --file_pred, --file_cal, and --file_pred_out. So I didn't enable multi-threading in any special way, but I can clearly see that DeepLC utilizes all processor cores/threads.

Typically, --file_pred contains 200-400k peptides and --file_cal contains 1-2k peptides. It takes ~7-15 min to predict RT for all peptides (~2 seconds per 1k peptides); DeepLC reports 2 ms/sample.
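As a quick sanity check, those three figures are mutually consistent; a back-of-the-envelope calculation in Python:

```python
# The reported 2 ms/sample equals 2 s per 1k peptides, and scaling that
# to a 200-400k peptide prediction file gives roughly 7-13 minutes,
# in line with the observed ~7-15 min run times.
per_sample_s = 2e-3                       # 2 ms/sample, as reported by DeepLC
per_1k_s = per_sample_s * 1000            # -> 2 s per 1k peptides

for n_peptides in (200_000, 400_000):
    total_min = n_peptides * per_sample_s / 60
    print(f"{n_peptides:>7} peptides -> {total_min:.1f} min")
```

So the per-sample timer and the wall-clock estimate agree; the GPU question is whether that 2 ms/sample can be pushed down despite the non-model overhead.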

Accuracy is critical for my application, so a simpler/smaller model is not an option. As for using fewer models, I'm not sure I understand correctly how DeepLC works. I suppose it takes all the default pre-trained models (4 of them) and checks which of them fits the calibration set best. In my case, where the calibration set is much smaller than the total prediction set (2k peptides vs 200k), there should be almost no time difference for any number of models used. Am I right here?

Regards, Mark

RobbinBouwmeester commented 3 years ago

Hi Mark,

Indeed, it sounds like you are already using multi-threading, and those are the speeds I expect. If accuracy is critical you should not use fewer models. In a normal run DeepLC runs all models for the calibration purposes you mentioned, but after calibration-based selection it also uses the average (ensemble-like) prediction of several models (those that qualify for the ensemble). So it runs everything on the full prediction set for the averaging, not just on the calibration set.
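The calibrate-then-ensemble flow described above can be sketched roughly as follows (all names and the ensemble-admission rule are made up for illustration; DeepLC's real API and selection criteria differ):

```python
# Hypothetical sketch: score every model on the small calibration set,
# then average the FULL-set predictions of all models that qualify for
# the ensemble. This is why fewer models *would* save prediction time:
# each ensemble member still predicts the entire 200-400k peptide set.
import statistics

def mae(pred, obs):
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(pred)

def predict_ensemble(models, cal_peptides, cal_rts, pred_peptides):
    # 1) Run every model on the (cheap, small) calibration set.
    errors = {name: mae(model(cal_peptides), cal_rts)
              for name, model in models.items()}
    best = min(errors, key=errors.get)
    # 2) Models close enough to the best join the ensemble
    #    (illustrative threshold, not DeepLC's actual rule).
    ensemble = [m for name, m in models.items()
                if errors[name] <= 1.5 * errors[best]]
    # 3) Every ensemble member predicts the full set; the averaging
    #    step is where the bulk of the run time goes.
    all_preds = [m(pred_peptides) for m in ensemble]
    return [statistics.mean(col) for col in zip(*all_preds)]
```

Under this scheme, trimming the ensemble does cut full-set prediction time, but as noted above it is not advisable when accuracy is critical.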

Now that I think of it, I might have a solution (also something that might be very useful for our internal use of DeepLC). We can index predictions before calibration, which would mean super-fast predictions for previously seen peptides (not even achievable with multiple GPUs ;)). I will try to have this feature implemented before this Wednesday.
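The indexing idea amounts to caching raw (pre-calibration) predictions on disk so repeat peptides never touch the network again. A minimal sketch of that concept, assuming nothing about DeepLC's actual implementation (a real library would also key on modifications, not just the bare sequence):

```python
# Minimal prediction-library sketch: look peptides up in a CSV cache,
# run the slow model only for misses, and persist the merged results.
import csv
import os

def predict_with_library(peptides, predict_fn, library_path):
    # Load any previously cached predictions.
    library = {}
    if os.path.exists(library_path):
        with open(library_path, newline="") as f:
            library = {row[0]: float(row[1]) for row in csv.reader(f)}
    # Only peptides missing from the library hit the (slow) model.
    missing = [p for p in peptides if p not in library]
    if missing:
        for pep, rt in zip(missing, predict_fn(missing)):
            library[pep] = rt
        with open(library_path, "w", newline="") as f:
            csv.writer(f).writerows(library.items())
    return [library[p] for p in peptides]
```

On a rerun with the same peptides, `predict_fn` is never called, which is why this can beat any amount of GPU hardware for repeated analyses.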

Thanks for making me think about this! Might provide DeepLC with a killer feature.

Kind regards,

Robbin

RobbinBouwmeester commented 3 years ago

Hi Mark,

DeepLC has been updated with a library feature; it will soon be available on pip. To use the library feature, make sure you add the following command line arguments:

--use_library location/to/library --write_library

Please let me know if you have any trouble running the new code.

Kind regards,

Robbin

RobbinBouwmeester commented 3 years ago

The library feature should now be fully functional; if there are any problems, please reopen this issue.