Genome-Bioinformatics-RadboudUMC / DeNovoCNN

A deep learning approach to de novo variant calling in next generation sequencing data
GNU General Public License v3.0

computing requirements for DenovoCNN #13

Open sophienguyen01 opened 6 months ago

sophienguyen01 commented 6 months ago

Hi,

I tried to run DeNovoCNN in an environment with 32 CPU cores, using the Ashkenazi samples, but the job ran for more than 8 hours and was then terminated for no apparent reason. How many CPU cores do you recommend for running DeNovoCNN on large samples? The paper says that 16 CPU cores are enough, and I tried that as well, but it did not work either.

Thanks

gelana commented 5 months ago

Hi @sophienguyen01,

Thanks for trying out the tool!

Would it be possible for you to provide the full output of this run?

For a 50x WGS trio using 16 cores, the runtime was ~6 hours. Generally, the more cores are available, the faster DeNovoCNN should work.
If you provided VCF files as input, then DeNovoCNN should have generated variants_list.txt in your working directory, which is a list of variants to check for de novo status. If you could let me know how many variants are in there, I might be able to estimate how long it should normally take on a 32-CPU machine.
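A quick way to get that number is to count the entries in the file. This is only a minimal sketch: it assumes variants_list.txt holds one variant per line, which I have not verified against the actual file format, and the helper name `count_variants` is made up for illustration. If the file has a header line, subtract one from the result.

```python
from pathlib import Path


def count_variants(path):
    """Count non-empty lines in a text file.

    Assumption: each non-empty line corresponds to one variant.
    """
    with open(path, "r", encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip())


if __name__ == "__main__":
    # Assumed location: the working directory of the DeNovoCNN run.
    list_path = Path("variants_list.txt")
    if list_path.exists():
        print(count_variants(list_path))
    else:
        print(f"{list_path} not found in the current directory")
```

Blank lines are skipped so that a trailing newline at the end of the file does not inflate the count.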

sophienguyen01 commented 5 months ago

Thanks for your reply,

I also upgraded my server to 48 cores, and the job failed because it exceeded the time limit.

This is the last line of DenovoCNN job output: W tensorflow/core/common_runtime/process_function_library_runtime.cc:773] Ignoring multi-device function optimization failure: Deadline exceeded: meta_optimizer exceeded deadline.

How can I fix it?

sophienguyen01 commented 5 months ago

The job did output the variant list file, but it failed when it came to the deep learning job. There are 159213 variants in my list.

gelana commented 5 months ago

@sophienguyen01 with this number of variants it should take around 6 hours with 32 CPUs, and less with 48.

I am not familiar with the error that you have provided, but I have a feeling that your machine might have a GPU? The model is applied in parallel using multiprocessing on CPUs and was tested without a GPU. So if the model is trying to use the GPU and multiprocessing at the same time, that might cause problems.
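If a GPU does turn out to be present, one thing that could be tried is hiding it from TensorFlow so that everything runs on the CPUs. The sketch below uses the standard `CUDA_VISIBLE_DEVICES` environment variable; the function name `hide_gpus_from_tensorflow` is hypothetical, and where such a call would belong in a DeNovoCNN run is an assumption, since that depends on how the tool is launched. I have not tested whether this resolves the error above.

```python
import os


def hide_gpus_from_tensorflow():
    """Hide all CUDA devices from libraries loaded afterwards.

    Setting CUDA_VISIBLE_DEVICES to "-1" makes CUDA report no usable
    GPUs, so TensorFlow falls back to the CPU. This must happen before
    TensorFlow is imported in the process.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"


hide_gpus_from_tensorflow()

# Only import TensorFlow after the variable has been set, for example:
# import tensorflow as tf
# print(tf.config.list_physical_devices("GPU"))  # expected to be empty
```

When the tool is started from a shell or a job script, the same effect can be had by exporting `CUDA_VISIBLE_DEVICES=-1` in the environment before the command runs, since child processes inherit it.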