DaehwanKimLab / centrifuge

Classifier for metagenomic sequences
GNU General Public License v3.0
235 stars 73 forks source link

Unrecognised `--temp-directory` option, however specified in code? #268

Closed jfy133 closed 5 months ago

jfy133 commented 5 months ago

We are encountering a problem with the 'hardcoded' default /tmp when running multiple Centrifuge runs on the same node (different databases); resulting in the following error: (ERR): mkfifo(/tmp/46.inpipe1) failed.

When investigating where /tmp was being specified (i.e. not via an env variable such as$TMPDIR), I noticed that there appears to be an option called --temp-directory where we could presumably modify this.

However it's not listed in --help, and when supplying it to either the centrifuge wrapper or centrifuge-class, I get the following error:

centrifuge-class: unrecognized option '--temp-directory=./tmp/'

Is there any way to specify an alternative tmp/ directory, or could this parameter be exposed to the user?

mourisl commented 5 months ago

How did you specify that? I think you shall use "--temp-directory ./tmp" instead of "--temp-directory==./tmp/".

jfy133 commented 5 months ago

My apologies, that was the last of a range of tests, the full list of variations I tested:

centrifuge -x test-db-centrifuge/taxprofiler_cf -p 2 -1 ERX5474937_ERR5766181_1.fastq.gz -2 ERX5474937_ERR5766181_2.fastq.gz --report-file 2613_pe_ERR5766181_db3.centrifuge.report.txt -S 2613_pe_ERR5766181_db3.centrifuge.results.txt --temp-directory ./tmp

centrifuge-class -x test-db-centrifuge/taxprofiler_cf -p 2 -1 ERX5474937_ERR5766181_1.fastq.gz -2 ERX5474937_ERR5766181_2.fastq.gz --report-file 2613_pe_ERR5766181_db3.centrifuge.report.txt -S 2613_pe_ERR5766181_db3.centrifuge.results.txt --temp-directory ./tmp

centrifuge -x test-db-centrifuge/taxprofiler_cf -p 2 -1 ERX5474937_ERR5766181_1.fastq.gz -2 ERX5474937_ERR5766181_2.fastq.gz --report-file 2613_pe_ERR5766181_db3.centrifuge.report.txt -S 2613_pe_ERR5766181_db3.centrifuge.results.txt --temp-directory="./tmp"

centrifuge-class -x test-db-centrifuge/taxprofiler_cf -p 2 -1 ERX5474937_ERR5766181_1.fastq.gz -2 ERX5474937_ERR5766181_2.fastq.gz --report-file 2613_pe_ERR5766181_db3.centrifuge.report.txt -S 2613_pe_ERR5766181_db3.centrifuge.results.txt --temp-directory="./tmp"

centrifuge -x test-db-centrifuge/taxprofiler_cf -p 2 -1 ERX5474937_ERR5766181_1.fastq.gz -2 ERX5474937_ERR5766181_2.fastq.gz --report-file 2613_pe_ERR5766181_db3.centrifuge.report.txt -S 2613_pe_ERR5766181_db3.centrifuge.results.txt --temp_dir="./tmp"

centrifuge-class -x test-db-centrifuge/taxprofiler_cf -p 2 -1 ERX5474937_ERR5766181_1.fastq.gz -2 ERX5474937_ERR5766181_2.fastq.gz --report-file 2613_pe_ERR5766181_db3.centrifuge.report.txt -S 2613_pe_ERR5766181_db3.centrifuge.results.txt --temp_dir=./tmp 

In all cases the error is the same: /home/james/bin/miniconda3/envs/centrifuge/bin/centrifuge-class: unrecognized option '<PARAMETER_TESTED>=./tmp'

mourisl commented 5 months ago

Thank you for identifying this issue. I think I have fixed it and updated the code on the github repo. Could you please pull the repo down and give it a try?

jfy133 commented 5 months ago

Hi @mourisl I can confirm now that works - thank you very much for the quick fix! I do note it only works for the wrapper, and not for the the 'subcommands' e.g. classification-class, but I don't think that is so important for users in this case. So this works for me :).

When would it be possible to have a patch release of centrifuge with it the fix? For my case I am interested in the tool for a pipeline I maintain, and it uses bioconda (or rather biocontainer) images for getting the software.

mourisl commented 5 months ago

The centrifuge-class doesn't use the tmp folder, the tmp folder is just to hold the pipe files. Is it necessary for the pipe file to be in the specified folder in your case?

I recently noticed an issue in Centrifuge in the work of our new classifier Centrifuger, and my plan is to fix that in the next version. If you need the --temp-directory is necessary in your pipeline, I will draft a small release, like v1.0.4.1.

jfy133 commented 5 months ago

Ah ok - if centrifuge-class doesn't use it that's fine :). Generally yes, our issue is that we are running two (or more) centrifuge commands on the same node in parallel, and we are getting the mkfifo errors because I presume that one of the commands creates the mkfifo before the other (I'm not :100: sure). So I'm assuming if we can isolate the pipe files in separate places we won't have this clash.

So the --temp-directory is indeed necessary for our pipeline, so a patch release would be very gratefully received :). A 1.0.4.1 or 1.0.5 would be perfect. I can also myself handle the bioconda recipe/biocontainer update (Which is where our pipeline pulls centrifuge from)

mourisl commented 5 months ago

I just released the v1.0.4.1. Please let me know if you find other issues.

jfy133 commented 5 months ago

Thank you very very much @mourisl !

I will start the bioconda update procedure tomorrow, and try it in the pipeline. If I keep hitting the problems even when specifing the temp directory I will let you know!

mourisl commented 5 months ago

Thank you! I think the bioconda version will automatically bump up quite soon. For the temp-directory, you need to create that folder first.