davidemms / OrthoFinder

Phylogenetic orthology inference for comparative genomics
https://davidemms.github.io/
GNU General Public License v3.0

Non-default output directory should be used with -b #846

Open SalvadorGJ opened 1 year ago

SalvadorGJ commented 1 year ago

Hi,

I'm trying to recycle an OrthoFinder analysis that was itself generated with recycled results, similar to #670. The difference is that I want to run multiple analyses in parallel, each adding a different version of the proteome of a single new species. To manage this, I established an independent directory for each version, in which I execute the following command:

orthofinder -t 32 -M msa -y -b Results_May31 -f primary_transcripts/ -o AmexT_v50_FullOnly_stringtie_all.OrthoFinder_Results

Here, Results_May31 is a symbolic link to the previous OrthoFinder results, computed over the proteomes of 30 other species; primary_transcripts/ contains a single FASTA file corresponding to the version of the transcriptome I want to test; and the output directory is meant to be created in the current directory and named ${Proteome_Version}.OrthoFinder_Results. Unfortunately, I got this log:

OrthoFinder version 2.5.4 Copyright (C) 2014 David Emms

ERROR: Incompatible arguments, -o (non-default output directory) can only be used with a new OrthoFinder run using option '-f'
ERROR: An error occurred, ***please review the error messages*** they may contain useful information about the problem.

So I tried again without specifying the non-default output directory, as follows:

orthofinder -t 32 -M msa -y -b Results_May31 -f primary_transcripts/

The issue is that the default output directories of all the parallel jobs are now created in the same directory, the one the symbolic link points to, rather than in the directory where each command was executed. Since I'm using a workflow manager to parallelize the jobs, it is really important to control where the output is generated. Right now I'm dealing with this by copying the previous results into the directory where the command will be executed, but that takes a lot of time.

So, first I want to ask whether I'm doing something wrong. If this behavior is expected, I'd like to ask whether it would be possible to allow more freedom in specifying the output directory when using the -b option.

Thanks in advance, Salvador

SalvadorGJ commented 1 year ago

In the meantime, I solved the time-consuming copy by packing the previous OrthoFinder results into a tar archive, then copying and extracting it in the execution directory for each proteome I want to test. In any case, it would be very helpful to be able to choose a non-default output directory name when reusing previous results. For example, I would set a dynamic output directory name depending on the proteome version.
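A minimal sketch of that workaround, in case it helps anyone else. The function name, the archive name Results_May31.tar, and the directory layout are assumptions for illustration. The orthofinder call is the one from my second attempt above and is shown only as a comment.

```shell
#!/usr/bin/env bash
# Sketch of the tar-based workaround: unpack a private copy of the previous
# results into each job's working directory, so the results passed to -b
# live next to the job instead of behind a shared symbolic link.
# The function name and all paths are hypothetical.

# stage_previous_results ARCHIVE WORKDIR
# Extracts ARCHIVE (a tar of the previous results directory) into WORKDIR.
stage_previous_results() {
    local archive="$1" workdir="$2"
    mkdir -p "$workdir" || return 1
    tar -xf "$archive" -C "$workdir"
}

# Example use inside one job (not executed here):
#   tar -chf Results_May31.tar Results_May31   # once; -h follows the symbolic link
#   stage_previous_results Results_May31.tar "$PWD"
#   orthofinder -t 32 -M msa -y -b Results_May31 -f primary_transcripts/
```

Each job then works on its own extracted copy, at the cost of disk space for one copy of the previous results per proteome version.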

Best, Salvador

SalvadorGJ commented 7 months ago

Dear @davidemms

I'd like to kindly ask whether there is any news regarding this issue.

Best, Salvador