WrightonLabCSU / DRAM

Distilled and Refined Annotation of Metabolism: A tool for the annotation and curation of function for microbial and viral genomes
GNU General Public License v3.0

`DRAM-setup.py set_database_locations --update_description_db` process was Killed #224

Closed HushKuo closed 2 years ago

HushKuo commented 2 years ago

Hi,

Thank you for this wonderful tool.

I am facing an issue while running DRAM-v.py annotate, as follows: KeyError: 'YP_009329195.1'. To solve it, I tried the DRAM-setup.py set_database_locations --update_description_db command to update the description db. However, the process was unexpectedly interrupted and Killed was shown. I tried many times, but it always failed with the same Killed result. Nothing else seems to be interfering with the program, so I don't know how to continue my analysis with DRAM-v.

Can you please help me with this issue? It will be really helpful if you can guide me with this.

Thanks in advance.

rmFlynn commented 2 years ago

How much memory is available, and have you tried the --skip_uniref flag?

HushKuo commented 2 years ago

How much memory is available, and have you tried the --skip_uniref flag?

Thank you for your guidance!!!

Is updating the description db a memory-intensive step? It seems that plenty of memory is still available. :(

              total        used        free      shared  buff/cache   available
Mem:      791014988    15819248   387615928     2416820   387579812   770956488
Swap:       4194300     1802016     2392284

Additionally, it seems that the --skip_uniref flag is not accepted by either the DRAM-setup.py set_database_locations --update_description_db or the DRAM-setup.py update_description_db command.

(dram) [user@localhost DRAM]$ DRAM-setup.py set_database_locations --update_description_db --skip_uniref
usage: DRAM-setup.py [-h] {version,prepare_databases,set_database_locations,update_description_db,update_dram_forms,print_config,import_config,export_config} ...
DRAM-setup.py: error: unrecognized arguments: --skip_uniref
(dram) [user@localhost DRAM]$ DRAM-setup.py update_description_db --skip_uniref
usage: DRAM-setup.py [-h] {version,prepare_databases,set_database_locations,update_description_db,update_dram_forms,print_config,import_config,export_config} ...
DRAM-setup.py: error: unrecognized arguments: --skip_uniref

How can I update the description db with --skip_uniref? Could you please give me the full command?

By the way, this issue originally happened when I was running the DRAM-v.py annotate command. I then referred to similar issues #86 and #108 and figured out that the solution was to run update_description_db. Is there anything else that may cause the KeyError: 'YP_009329195.1' to be reported?

This is the output I get when I run grep -a YP_009329195.1 refseq_viral.20220915.mmsdb*, for your reference.

refseq_viral.20220915.mmsdb_h:YP_009329195.1 GXT repeat-containing collagen-like protein [Cedratvirus A11]
refseq_viral.20220915.mmsdb.idx:YP_009329195.1 GXT repeat-containing collagen-like protein [Cedratvirus A11]
refseq_viral.20220915.mmsdb.lookup:222094   YP_009329195.1  0

Looking forward to your reply! Thank you, Rory!

rmFlynn commented 2 years ago

Sorry about that, I was thinking you were setting up the database, not moving it. I sometimes answer these on autopilot, so sorry about that. I will look at your specs in a moment but first why are you using DRAM-setup.py set_database_locations --update_description_db instead of just DRAM-setup.py update_description_db?

rmFlynn commented 2 years ago

Updating the descriptions takes up more memory than any other step. Assuming that the output you got from the free command is in kilobytes, you should have enough memory.
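As a quick sanity check on those numbers, here is a minimal sketch that converts the Mem: line of the free output above into GiB. It assumes the default free output, where values are reported in kibibytes; the function name `available_gib` is only illustrative and is not part of DRAM.

```python
def available_gib(free_output):
    """Return the 'available' column of the Mem: line, converted from KiB to GiB."""
    lines = free_output.strip().splitlines()
    header = lines[0].split()
    for line in lines[1:]:
        fields = line.split()
        if fields and fields[0] == "Mem:":
            # The header row has no label column, so values are offset by one.
            kib = int(fields[1 + header.index("available")])
            return kib / (1024 ** 2)
    raise ValueError("no 'Mem:' line found")


# Output of `free` as pasted earlier in this thread (assumed to be in KiB).
FREE_OUTPUT = """
              total        used        free      shared  buff/cache   available
Mem:      791014988    15819248   387615928     2416820   387579812   770956488
Swap:       4194300     1802016     2392284
"""

print(f"{available_gib(FREE_OUTPUT):.1f} GiB available")
```

Under that assumption the machine has roughly 735 GiB available, so a Killed message is more likely to come from a per-job or per-session limit than from the machine running out of memory.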

To set up without Uniref, without starting from scratch, export your config with `DRAM-setup.py export_config > my_config.txt`. Then find uniref in my_config.txt and set its path to null, with no ' or " around it, so that the entry reads ...: "bla/bla", "uniref": null, "pfam": "... Finally, save the file and re-import it with `DRAM-setup.py import_config --config_loc my_config.txt`.
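If you would rather not edit the file by hand, the same change can be sketched in Python. This is a minimal sketch, not part of DRAM: it assumes the exported my_config.txt is valid JSON and that the key is spelled exactly "uniref" somewhere in it (possibly nested). The function names `null_out_key` and `disable_uniref` are hypothetical. Keep a backup copy of the export, because the file is overwritten in place.

```python
import json


def null_out_key(node, key="uniref"):
    """Recursively set every non-null occurrence of `key` to None; return how many changed."""
    changed = 0
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                if v is not None:
                    node[k] = None  # serialized by json as a bare null, without quotes
                    changed += 1
            else:
                changed += null_out_key(v, key)
    elif isinstance(node, list):
        for item in node:
            changed += null_out_key(item, key)
    return changed


def disable_uniref(config_path):
    """Rewrite the exported config (assumed to be JSON) with the uniref path set to null."""
    with open(config_path) as fh:
        config = json.load(fh)
    changed = null_out_key(config)
    with open(config_path, "w") as fh:
        json.dump(config, fh, indent=2)
    return changed


# Example use, after running: DRAM-setup.py export_config > my_config.txt
# disable_uniref("my_config.txt")
```

After running it, re-import the file with DRAM-setup.py import_config --config_loc my_config.txt as described above. A return value of 0 means no non-null "uniref" entry was found, in which case the file should be checked by hand.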

I think I also need to know how jobs are being run on your system, are you over ssh in screen/tmux or are you running slurm jobs?

HushKuo commented 2 years ago

Updating the descriptions takes up more memory than any other step. Assuming that the output you got from the free command is in kilobytes, you should have enough memory.

To set up without Uniref, without starting from scratch, export your config with `DRAM-setup.py export_config > my_config.txt`. Then find uniref in my_config.txt and set its path to null, with no ' or " around it, so that the entry reads ...: "bla/bla", "uniref": null, "pfam": "... Finally, save the file and re-import it with `DRAM-setup.py import_config --config_loc my_config.txt`.

I think I also need to know how jobs are being run on your system, are you over ssh in screen/tmux or are you running slurm jobs?

That's cool! I have successfully set up without Uniref under your guidance! update_description_db succeeded and DRAM-v.py annotate worked too! Thank you very much!

But I wonder whether the output annotation information will differ with or without Uniref. Should I set the "uniref" path back by re-importing with DRAM-setup.py import_config --config_loc my_config.txt before running DRAM-v.py annotate?

HushKuo commented 2 years ago

Sorry about that, I was thinking you were setting up the database, not moving it. I sometimes answer these on autopilot, so sorry about that. I will look at your specs in a moment but first why are you using DRAM-setup.py set_database_locations --update_description_db instead of just DRAM-setup.py update_description_db?

Not your fault at all !!!

rmFlynn commented 2 years ago

But I wonder whether the output annotation information will differ with or without Uniref. Should I set the "uniref" path back by re-importing with DRAM-setup.py import_config --config_loc my_config.txt before running DRAM-v.py annotate?

Actually, for DRAM-v the answer should be no. You would only get Uniref results from DRAM-v.py annotate if you passed the --use_uniref flag in the DRAM-v.py annotate ... command, so your current results are the DRAM-v default. Adding Uniref would put IDs into your annotations, but not into your distillate. Also, you would need to run update_description_db again, and it would fail. If you did not run update_description_db, then annotation would fail.

For the non-viral DRAM.py annotate, Uniref plays a small role. In fact, I may add a note to the README that it is probably not worth the trouble for most users.

HushKuo commented 2 years ago

But I wonder whether the output annotation information will differ with or without Uniref. Should I set the "uniref" path back by re-importing with DRAM-setup.py import_config --config_loc my_config.txt before running DRAM-v.py annotate?

Actually, for DRAM-v the answer should be no. You would only get Uniref results from DRAM-v.py annotate if you passed the --use_uniref flag in the DRAM-v.py annotate ... command, so your current results are the DRAM-v default. Adding Uniref would put IDs into your annotations, but not into your distillate. Also, you would need to run update_description_db again, and it would fail. If you did not run update_description_db, then annotation would fail.

For the non-viral DRAM.py annotate, Uniref plays a small role. In fact, I may add a note to the README that it is probably not worth the trouble for most users.

Thank you for your explanation!!! I got it!!!

rmFlynn commented 2 years ago

I hope this is resolved. Please re-open if you run into problems again.