jtamames / SqueezeMeta

A complete pipeline for metagenomic analysis
GNU General Public License v3.0

sqm2zip has stopped working? #813

Closed by dutchscientist 3 months ago

dutchscientist commented 3 months ago

We are in the middle of processing a lot of samples, and suddenly sqm2zip has stopped working. I have tried it with a dataset that worked last weekend, and this fails as well.

Ubuntu22, everything up to date. The output is:

(SqueezeMeta) account@computer01:~/data/name/folder$ sqm2zip.py folder folder_output
Traceback (most recent call last):
  File "/home/account/miniconda3/envs/SqueezeMeta/bin/sqm2zip.py", line 103, in <module>
    main(parse_args())
  File "/home/account/miniconda3/envs/SqueezeMeta/bin/sqm2zip.py", line 77, in main
    with ZipFile(output, 'w') as outzip:
  File "/home/account/miniconda3/envs/SqueezeMeta/lib/python3.10/zipfile.py", line 1240, in __init__
    self.fp = io.open(file, filemode)
FileNotFoundError: [Errno 2] No such file or directory: 'folder_output/folder.zip'

Have tried it on two different computers, reinstalled SqueezeMeta in a fresh conda environment, all giving the same problem. Any thoughts on what this could be? :)

fpusan commented 3 months ago

Does folder_output exist? Note that sqm2zip does not create this folder; it must be an existing folder in which the zip file will be placed.
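For anyone hitting the same traceback, the behaviour can be reproduced outside SqueezeMeta. The following is a minimal sketch, not the actual sqm2zip.py code: `write_zip` and `create_missing` are hypothetical names, and the zip content is a placeholder. It only illustrates that `ZipFile(path, 'w')` raises FileNotFoundError when the parent directory is missing, and that creating the directory first avoids it.

```python
import os
import tempfile
from zipfile import ZipFile


def write_zip(output_dir, zip_name, create_missing=False):
    """Write a small zip to output_dir/zip_name.

    Hypothetical helper for illustration only; not part of sqm2zip.py.
    """
    if create_missing:
        # Create the output directory (and any parents) if it is absent.
        os.makedirs(output_dir, exist_ok=True)
    zip_path = os.path.join(output_dir, zip_name)
    # ZipFile does not create missing parent directories by itself.
    with ZipFile(zip_path, "w") as outzip:
        outzip.writestr("placeholder.txt", "demo content\n")
    return zip_path


with tempfile.TemporaryDirectory() as tmp:
    missing_dir = os.path.join(tmp, "folder_output")

    # Case 1: the output directory does not exist, as in the report.
    try:
        write_zip(missing_dir, "folder.zip")
        failed = False
    except FileNotFoundError as err:
        failed = True
        print(f"Reproduced: {err}")

    # Case 2: create the directory first, then write the zip.
    created = write_zip(missing_dir, "folder.zip", create_missing=True)
    created_ok = os.path.isfile(created)
    print(f"Zip written after creating the directory: {created_ok}")
```

On the command line, the equivalent workaround is simply to create folder_output (for example with mkdir) before calling sqm2zip.py.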

fpusan commented 3 months ago

Also, if you plan on relying extensively on sqm2zip, be aware that the resulting zip file may not be directly loadable in SQMtools if the project is very large. The data is still all there (this is a bug in R rather than in SqueezeMeta), so you can recover it and load it after uncompressing the file. See more details and a workaround in #755.

dutchscientist commented 3 months ago

> Does folder_output exist? Note that this is not a folder to be created, but an existing folder in which to place the zip file

I did that yesterday and it didn't work, and today it works. Must have made a silly typo or so. PEBKAC, I guess...

Thanks for the warning about the large size, I'll make the user aware (ours are ~3.7 GB) :)

eperezv commented 3 months ago

I'm having the same issue. I also noticed that sqm2zip used a lot of RAM (128 GB was not enough, and it needed roughly 80 GB more from swap).

fpusan commented 3 months ago

Works for me, but the output directory needs to exist (admittedly not ideal; I can fix it in the next version). So just make sure that folder_output exists. I didn't expect such high RAM usage. How many ORFs do you have? Do you also see this high RAM usage when running sqm2tables.py on that project?
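To compare RAM usage between the two scripts, one option is to record the peak resident memory of each run. This is a rough sketch using only the Python standard library; `peak_child_rss_mib` is a hypothetical helper name, the demo command is a stand-in rather than a SqueezeMeta call, and it assumes Linux, where `ru_maxrss` is reported in KiB (other Unix systems may use different units).

```python
import resource
import subprocess
import sys


def peak_child_rss_mib(cmd):
    """Run cmd to completion and return the peak resident set size in MiB.

    Hypothetical helper for illustration. RUSAGE_CHILDREN reports the
    largest peak among all child processes this Python process has
    waited for, so call it once per fresh Python process for a clean
    reading. Assumes Linux, where ru_maxrss is in KiB.
    """
    subprocess.run(cmd, check=True)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return usage.ru_maxrss / 1024.0


# Stand-in command so the sketch runs anywhere Python does; replace it
# with the actual sqm2zip.py or sqm2tables.py command line you use.
demo_cmd = [sys.executable, "-c", "data = list(range(1_000_000))"]
peak = peak_child_rss_mib(demo_cmd)
print(f"Peak resident memory of child: {peak:.0f} MiB")
```

Running it once per script, each in a separate Python process, gives two numbers that can be compared directly. Note that resident memory excludes pages moved to swap, so for a run that spills into swap the figure will understate total memory demand.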

eperezv commented 3 months ago

It was solved by creating folder_output. Thanks.

I have 18M ORFs. I didn't check the RAM usage when running sqm2tables; I will update after trying.