carlferr / HOME-BIO

HOME-BIO (sHOtgun MEtagenomic analysis of BIOlogical entities)
GNU Lesser General Public License v3.0

Issues in downloading from Zenodo #2

Closed BioH4z closed 1 year ago

BioH4z commented 2 years ago

Our database is hosted on Zenodo (DOI: 10.5281/zenodo.4055180).

Users are having difficulty downloading it. Each time, the download starts but stops at 20-30%. The problem occurs whether users download it from the web or from the command line ("server error" or "connection refused").

We are trying several solutions.
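
One possible workaround for downloads that stall partway is to resume from the bytes already on disk instead of restarting. The following is a minimal sketch, assuming the server honours HTTP Range requests (whether Zenodo does so is not confirmed in this thread). The function names `range_header` and `resume_download` are hypothetical and not part of HOME-BIO.

```python
import os
import urllib.request


def range_header(partial_path):
    """Return the HTTP Range header needed to resume into partial_path."""
    # Resume from however many bytes are already on disk (0 if none).
    offset = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    return {"Range": f"bytes={offset}-"} if offset else {}


def resume_download(url, dest, chunk_size=1 << 20):
    """Download url to dest, continuing from a partial file if one exists."""
    request = urllib.request.Request(url, headers=range_header(dest))
    with urllib.request.urlopen(request) as response:
        # 206 means the server honoured the Range request; otherwise start over.
        mode = "ab" if response.status == 206 else "wb"
        with open(dest, mode) as out:
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
```

Calling `resume_download(url, dest)` again after a failure continues from where the previous attempt stopped. If the file on disk is already complete, the server will typically answer with an error status, which `urlopen` raises as an exception.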

flaviode commented 2 years ago

I also tried storing the database on an exFAT drive, but that does not work either. I hope you can fix this issue. Thanks

BioH4z commented 2 years ago

Dear user, we tried to find a solution with Zenodo but have not succeeded yet. Unfortunately, the main problem is the large size of the files: even small fluctuations in internet speed are enough to stop the download and trigger the errors. We have decided to temporarily move the database to Google Drive. This is not the best solution but (we hope) the fastest one.

In the next few days, we will provide a Google Drive link. Once the owner accepts your request, you will be able to download the files.

However, it is also possible to generate your own indexed files by following the instructions in our tutorial.

We will provide the link as soon as possible.

BioH4z commented 2 years ago

Dear user, here is the Google Drive link: https://drive.google.com/drive/folders/1ESDe-WVX2XJ0eLHz6AjGhZBu1IfKO64L?usp=sharing.

Clicking on it sends a notification to the owner of the folder, who will then approve your request. Please download the entire folder and let us know if everything works.

We are planning to change the files uploaded to Zenodo to make them easier to download. Stay tuned for updates.

flaviode commented 2 years ago

Hi there, I finally downloaded the db, thanks for the link! Everything seems to run smoothly and the QC steps are fine. However, when the bacteria analysis starts, it returns:

Running Shotgun Analysis
Running kraken for bacteria
Loading database information...Failed attempt to allocate 33174075976bytes;
you may not have enough free memory to load this database.
If your computer has enough RAM, perhaps reducing memory usage from
other programs could help you load this database?
classify: unable to allocate hash table memory
Error: cannot create Kraken file

I have 16 GB of RAM, and the background processes take no more than 6 GB. How can I deal with this? Protozoa and viruses work fine.
Thanks for your help

BioH4z commented 2 years ago

Dear user, unfortunately Kraken2 tries to load the entire database into RAM, as stated in the Kraken2 manual: "To run efficiently, Kraken 2 requires enough free memory to hold the database (primarily the hash table) in RAM. While this can be accomplished with a ramdisk, Kraken 2 will by default load the database into process-local RAM; the --memory-mapping switch to kraken2 will avoid doing so."
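
To put the numbers from the error message in perspective, here is a small arithmetic sketch. It assumes the reported "16G" and "6G" mean 16 GiB and 6 GiB; the helper `gib` is only an illustration.

```python
def gib(n_bytes):
    """Convert a byte count to gibibytes (2**30 bytes)."""
    return n_bytes / 2**30


requested = 33174075976   # allocation size from the Kraken 2 error message
total_ram = 16 * 2**30    # assumption: "16G" means 16 GiB
in_use = 6 * 2**30        # assumption: background processes use 6 GiB

free = total_ram - in_use
print(f"requested: {gib(requested):.1f} GiB")  # requested: 30.9 GiB
print(f"free:      {gib(free):.1f} GiB")       # free:      10.0 GiB
print("fits in RAM:", requested <= free)       # fits in RAM: False
```

The bacteria database asks for roughly 31 GiB in a single allocation, about three times the memory that is free on this machine, which is why the default in-RAM loading fails.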

The only way to solve this issue is to create a RAM disk on your PC (there are many free tools for this), move the database to that disk, and then modify the script.

On lines 244 and 246 of Script.py, just add "--memory-mapping" after "os.system("kraken2". The result will look something like this: os.system("kraken2 --memory-mapping --db Db_Kraken2_Kaiju_bacteria ...

Remember to move Db_Kraken2_Kaiju_bacteria to the RAM disk and to change the path in config_file.txt. Please note that this will increase the time needed to perform the analysis.
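
As an illustration of the change, the sketch below builds a kraken2 command line with the `--memory-mapping` switch. It is not the actual content of Script.py, whose full command is not shown in this thread. The function `kraken2_command`, the RAM disk mount point, and the file names are all hypothetical.

```python
import shlex
import subprocess


def kraken2_command(db_path, reads, output, memory_mapping=True):
    """Build the kraken2 argument list; all paths are hypothetical."""
    cmd = ["kraken2"]
    if memory_mapping:
        # Ask kraken2 not to copy the database into process-local RAM.
        cmd.append("--memory-mapping")
    cmd += ["--db", db_path, "--output", output, reads]
    return cmd


if __name__ == "__main__":
    # Assumed RAM disk location; adjust to wherever the database was moved.
    cmd = kraken2_command("/mnt/ramdisk/Db_Kraken2_Kaiju_bacteria",
                          "sample.fastq", "sample.kraken")
    print(shlex.join(cmd))
    # subprocess.run(cmd, check=True)  # uncomment to actually run kraken2
```

Passing the arguments as a list to `subprocess.run` avoids shell quoting problems that can arise with `os.system` when paths contain spaces.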

We understand that this is not an easy task, which is why we do not usually suggest it, but unfortunately it is the only solution available for PCs with less RAM.

flaviode commented 2 years ago

Many thanks for your suggestions, I will try them! Meanwhile, I used the pre-built MiniKraken2_v2_8GB database available on the Kraken2 website, and it works fine! I may lose some resolution, but the analysis completed. I'll get back to you once I have tried the procedure you suggest. Thanks for your valuable help!