run time very long - Githubissues

qisun2 commented 2 years ago

I tried deeplasmid on a node with 100 CPU cores. I used the provided Docker image (billandreo/deeplasmid) and followed your instructions, running the command feature_DL_plasmid_predict.sh. I tested it on 5 different E. coli genomes. I do get results back, but it takes 1-2 hours per genome. I monitored the CPU usage, and all 100 cores were in use. I also tried a GPU node, but the software did not seem to use the GPU. In your paper, the runtime is 2 min per genome on a cluster of 5 16-core nodes. I am trying to figure out what I did wrong. Also, I cannot find the download link for the test genome 649989979.fna mentioned in the instructions.

wandreopoulos commented 2 years ago

Hi Qi Sun, the 649989979.fna file is available here: https://github.com/wandreopoulos/deeplasmid/tree/master/classifier/dl/testing/649989979 Can you please try this file? The runtime depends on the number of contigs in the file. I will be glad to test your input E. coli files if you can provide them. Thanks, Bill


William B. Andreopoulos, Ph.D. Joint Genome Institute LBNL

qisun2 commented 2 years ago

I see. The runtime depends on contig number. My assemblies are fragmented, each with hundreds of contigs.
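Since runtime scales with the number of contigs, it can help to count contigs per assembly before launching a batch. The following is a minimal sketch, not part of deeplasmid: it assumes plain-text (uncompressed) FASTA files in which every record header starts with `>`, and the function names and the `*.fna` pattern are hypothetical choices for illustration.

```python
from pathlib import Path


def count_contigs(fasta_path):
    """Count FASTA records by counting header lines that start with '>'."""
    with open(fasta_path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def contig_counts(fasta_dir, pattern="*.fna"):
    """Map each matching FASTA file name in fasta_dir to its contig count."""
    # The "*.fna" pattern is an assumption; adjust it to your file naming.
    return {
        path.name: count_contigs(path)
        for path in sorted(Path(fasta_dir).glob(pattern))
    }
```

Sorting the resulting counts shows which assemblies are the most fragmented and therefore likely to take the longest.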

wandreopoulos commented 2 years ago

Also, the Docker image I provided was built and tested on a MacBook. Do you have access to a MacBook to run it on? If you can provide the E. coli FASTA files, I could run them for you. As I recall, I ran the native code on a compute cluster rather than the Docker image, because the Docker image did behave strangely on a compute cluster.


wandreopoulos commented 2 years ago

Please note that earlier this month I released the deeplasmid-gpu Docker image, which runs on a GPU. I have updated the README.md file with instructions for running the deeplasmid-gpu image.

I have sent you the results for your E. coli assemblies via email. Thanks, Bill


wandreopoulos commented 1 year ago

Hello Qi Sun @qisun2 ,

I just released a new Docker image for GPUs that is built on tensorflow/tensorflow:latest-gpu. I left CNTK behind and moved to TensorFlow.

As described in the README, the usage is much simpler. For example:

docker pull billandreo/deeplasmid.tf.gpu2
sudo /usr/bin/docker run -it -v /path/to/fasta:/srv/jgi-ml/classifier/dl/in.fasta -v /path/to/OUT/dir:/srv/jgi-ml/classifier/dl/outdir billandreo/deeplasmid.tf.gpu2 deeplasmid.sh in.fasta outdir
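For many assemblies, the run command above can be generated per file rather than typed by hand. The sketch below only builds the argument lists and does not launch anything. It reuses the image name, container paths, and flags exactly as quoted in this comment; the helper names, the `*.fna` pattern, and the one-output-directory-per-assembly layout are assumptions for illustration, not something the README prescribes.

```python
import os
from pathlib import Path

# Image name and container paths are copied from the command quoted above.
IMAGE = "billandreo/deeplasmid.tf.gpu2"
CONTAINER_DIR = "/srv/jgi-ml/classifier/dl"


def build_command(fasta, outdir):
    """Return the docker argv for one FASTA file (hypothetical helper)."""
    fasta = os.path.abspath(fasta)    # bind mounts need absolute host paths
    outdir = os.path.abspath(outdir)
    return [
        "sudo", "/usr/bin/docker", "run", "-it",
        "-v", f"{fasta}:{CONTAINER_DIR}/in.fasta",
        "-v", f"{outdir}:{CONTAINER_DIR}/outdir",
        IMAGE, "deeplasmid.sh", "in.fasta", "outdir",
    ]


def plan_batch(fasta_dir, out_root, pattern="*.fna"):
    """Build one command per FASTA file, each with its own output directory."""
    # Assumption: outputs go to out_root/<file stem>, one directory per assembly.
    return [
        build_command(path, Path(out_root) / path.stem)
        for path in sorted(Path(fasta_dir).glob(pattern))
    ]
```

Each returned list could be passed to `subprocess.run`. Note that `-it` requests an interactive terminal, so it may need to be dropped when the commands run from a non-interactive batch job.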

If you need help with some other microbial assemblies, feel free to ask me.

qisun2 commented 1 year ago

Thanks! Qi


qisun2 commented 1 year ago

Bill,

I am doing another round of deeplasmid runs. I have 3000-4000 assemblies; each assembly has 2.4 megabases and about 50-200 contigs.

I tried it on my GPU node, and I can finish 10 assemblies per hour. As I have two GPUs on the server, I run two jobs at a time. I checked the nvidia-smi output, and each GPU is used at 40% capacity. I should be able to finish everything in two weeks.

Do you think that is a reasonable speed? Any suggestions to make it faster?
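The two-week figure can be checked with simple arithmetic. This sketch assumes that the quoted rate of 10 assemblies per hour is the combined throughput of both GPUs and that it stays constant across the whole batch; the function name is hypothetical.

```python
def days_to_finish(n_assemblies, assemblies_per_hour):
    """Wall-clock days needed at a constant throughput."""
    return n_assemblies / assemblies_per_hour / 24


# Assumption: 10 assemblies per hour is the total across both GPUs.
for n in (3000, 4000):
    print(n, "assemblies:", round(days_to_finish(n, 10), 1), "days")
# prints 12.5 days for 3000 assemblies and 16.7 days for 4000
```

Under that assumption the batch takes roughly 12.5 to 16.7 days, in line with the two-week estimate. If the rate were per GPU instead, the totals would halve.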

Thanks!

Qi


wandreopoulos / deeplasmid

run time very long #1