ncbi / fcs

Foreign Contamination Screening caller scripts and documentation

Set GX_NUM_CORES #23

Closed. MicheleSonny closed this issue 1 year ago.

MicheleSonny commented 1 year ago

Hello everybody, I set GX_NUM_CORES to 48 and checked it with the following command: docker run --env GX_NUM_CORES ncbi/fcs-gx env | grep GX_NUM, which returns GX_NUM_CORES=48. However, when I run run_fcsgx.py it doesn't run in parallel and the parsing is slow. Could you help me solve this parallelization problem and speed up the parsing?
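For reference, the check looks like this (a minimal sketch, assuming the variable is exported in the host shell before calling docker):

```bash
# Export the variable on the host so that --env GX_NUM_CORES forwards it
# into the container.
export GX_NUM_CORES=48

# Verify that the container actually sees the value.
docker run --env GX_NUM_CORES ncbi/fcs-gx env | grep GX_NUM
# Expected output: GX_NUM_CORES=48
```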

Thank you. My system: Ubuntu 20.04 LTS, 252 GB of RAM, 500 GB of swap on disk, FCS GX 'all' database.

Best regards, Michele

etvedte commented 1 year ago

Hi Michele,

Did you try running run_fcsgx.py on the test dataset/database? How long does that take to run?

> when I run run_fcsgx.py it doesn't run in parallel and the parsing is slow

How are you assessing this? What parsing are you referring to?

What kinds of genomes are you running? What GX step(s) are you referring to? For example, the initial retrieval of the GX 'all' database tends to be slower (60-90 minutes is normal) but then the actual pipeline tends to be fast (on the order of minutes, not hours). Subsequent GX runs with a downloaded GX database should also be fast.
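If it helps, here is one way to time a run and capture peak memory (a sketch assuming GNU time is installed; the run_fcsgx.py arguments below are placeholders, the real options are in the wiki quickstart):

```bash
# GNU time (usually /usr/bin/time on Ubuntu) reports wall-clock time and
# peak memory. The run_fcsgx.py arguments here are placeholders only.
/usr/bin/time -v -o fcsgx_time.log python3 ./run_fcsgx.py <options from the wiki quickstart>

# Wall-clock time and peak RAM usage of the run:
grep -E "Elapsed \(wall clock\)|Maximum resident" fcsgx_time.log
```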

etvedte commented 1 year ago

Hi Michele,

I hadn't noticed before, but you are trying to run GX on a host with insufficient RAM for optimal performance. We recommend hosts with at least 512 GB of RAM:

https://github.com/ncbi/fcs/wiki/FCS-GX#prerequisites

I don't have performance metrics on hand, but anecdotally we have tested with smaller RAM hosts and observed very long run times.
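For reference, the relevant host numbers can be checked with standard Linux commands (nothing FCS-specific):

```bash
# Total, used, and available memory (and swap) in GB on the host.
free -g

# Number of CPUs the host exposes.
nproc
```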

MicheleSonny commented 1 year ago

Hi etvedte, thanks for the reply.

With htop I monitored the number of active CPUs and the available RAM/swap memory (256 GiB / 500 GiB).

I'm running it on a de novo genome assembly FASTA of about 600 MiB with roughly 6000 contigs, before submitting it to NCBI.
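For reference, a quick way to check the assembly size and contig count (assembly.fasta is just an example file name for an uncompressed FASTA):

```bash
# Size of the assembly on disk.
ls -lh assembly.fasta

# Number of contigs, i.e. FASTA header lines.
grep -c '^>' assembly.fasta
```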

The initial download of the 'all' database was fast.

The problem was partially solved by removing the ncbi/fcs-gx container and then running the run_fcsgx.py script directly, without pulling and running Docker first and without setting GX_NUM_CORES=48.

However, run this way it occupied all of the local server's resources.

My server has the following resources: Ubuntu 20.04 LTS, 48 Intel Xeon CPUs, 256 GB of RAM, and a 500 GB swap file on an HDD (not an SSD).

Is there a way to reduce the number of dedicated CPUs and cap the maximum memory used, for example by editing the run_fcsgx.py script?
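For example, would something along these lines make sense? This is only a sketch using standard docker run flags; I understand it caps the container, not how much memory GX itself needs:

```bash
# Hypothetical caps: at most 24 CPUs and 200 GB of RAM for the fcs-gx
# container. The trailing env check is just the placeholder command from
# before; the real screening invocation follows the FCS-GX wiki.
docker run --cpus=24 --memory=200g --env GX_NUM_CORES=24 ncbi/fcs-gx env | grep GX_NUM
```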

Thank you very much, Michele

etvedte commented 1 year ago

Hi Michele,

When GX runs, the database is accessed randomly, so under your current configuration roughly every other hit would trigger a page swap. That will run really, really slowly.
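You can see this on your host with standard Linux tools (a sketch; /path/to/gxdb is a placeholder for wherever you downloaded the database):

```bash
# On-disk size of the GX database versus memory actually available.
du -sh /path/to/gxdb
free -g

# During a run, sustained non-zero si/so (swap-in/swap-out) columns mean
# the database does not fit in RAM and lookups are paying for disk I/O.
vmstat 5
```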

Currently we are not pursuing code changes to make the maximum memory adjustable. We need to update the language in our wiki to make clear that 512 GB of RAM is a technical requirement, not a recommendation. Users who cannot meet this requirement on a local machine can run GX cheaply on a cloud service such as AWS or GCP; the wiki provides documentation on getting started with cloud-based runs.

pstrope commented 1 year ago

Closing. Please follow-up if you have other questions.

MicheleSonny commented 1 year ago

Hello etvedte, sorry for the delay in closing the issue. To conclude: to complete the FCS run correctly I had to use an AWS instance with 1 TB of RAM and a 1 TB disk. Thanks, Michele