PalMuc / TransPi

TransPi – a comprehensive TRanscriptome ANalysiS PIpeline for de novo transcriptome assembly

Problem in installing #4

Closed silenus092 closed 3 years ago

silenus092 commented 3 years ago

Hi,

Thank you for providing such a useful tool.

However, I ran into an error while installing the program. I have attached a screenshot.

image

I selected EUKARYOTA and then Superkingdom, and the program suddenly stopped with the error message below.

    Please select database: 1
    cat: ./conf/busV4list.txt: No such file or directory
    cat: ./conf/busV4list.txt: No such file or directory
    cat: ./conf/busV4list.txt: No such file or directory
    cat: ./conf/busV3list.txt: No such file or directory

         -- No BUSCO V3 available for  --

Here are the commands I have used so far:

    git clone https://github.com/palmuc/TransPi.git
    cd TransPi
    bash precheck_TransPi.sh .

rivera10 commented 3 years ago

Hello,

To solve this, pass the full (absolute) path to the TransPi directory instead of "."

Example:

bash precheck_TransPi.sh /home/ubuntu/TransPi

I'll try to find a way to fix this in a future update.

Thanks
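
The workaround above can also be scripted so that the absolute path does not have to be typed by hand. This is only a minimal sketch: abs_dir is a hypothetical helper (not part of TransPi), and it assumes the precheck script only needs an absolute path to the checkout, as the reply suggests.

```shell
# Hypothetical helper (not part of TransPi): turn a directory argument
# such as "." into an absolute path. The function body runs in a
# subshell, so the caller's working directory is left unchanged.
abs_dir() (
    cd "$1" && pwd
)

# Resolve the TransPi checkout (here: the current directory).
transpi_dir="$(abs_dir .)"
echo "$transpi_dir"

# Then pass the absolute path to the precheck, as suggested above:
#   bash precheck_TransPi.sh "$transpi_dir"
```

When run from inside the cloned TransPi directory, this should be equivalent to typing the full path by hand.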

silenus092 commented 3 years ago

Thank you, it runs now. By the way, I have a small question: how do I choose the k-mer sizes? I saw the list of numbers in the GitHub docs and in your paper, for example --k 25,41,53 --maxReadLen 75. However, I still do not fully understand how these numbers (e.g., 25,41,53) relate to the sequence length.

If I have

file                format  type    num_seqs        sum_len  min_len  avg_len  max_len
SRR1552488_1.fastq  FASTQ   DNA   85,461,959  4,273,097,950       50       50       50
SRR1552488_2.fastq  FASTQ   DNA   85,461,959  4,102,174,032       48       48       48

So could it be --k 25,31,35 --maxReadLen 50?

rivera10 commented 3 years ago

The selection of k-mers is based on your read length. However, whatever the length, try to combine "short" and "long" k-mers. Short k-mers generate more transcripts (but with more errors), and long k-mers generate fewer transcripts (but with fewer misassemblies). For 50 bp reads I have often used --k 25,33,37. Colleagues of mine have tried different combinations (e.g. 25,33,39), and the results do not change much because we are using a multi-assembler approach.
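
As a rough illustration of tying the k-mer list to the read length, a small sanity check can be run before launching the pipeline. This is a sketch under stated assumptions: check_kmers is a hypothetical helper, not a TransPi command, and the two rules it applies (every k shorter than the read length, and odd k values, which is the usual convention for de Bruijn graph assemblers) are general rules of thumb rather than anything TransPi is known to enforce.

```shell
# Hypothetical helper (not part of TransPi): sanity-check a comma-separated
# k-mer list against the read length. The function body runs in a subshell,
# so changing IFS here does not leak into the calling shell.
check_kmers() (
    kmers="$1"
    max_read_len="$2"
    IFS=','
    for k in $kmers; do
        # A k-mer cannot be extracted from a read that is not longer than k.
        if [ "$k" -ge "$max_read_len" ]; then
            echo "k=$k is not shorter than the read length ($max_read_len)"
            return 1
        fi
        # Odd k is the common convention; treated here as a rule of thumb.
        if [ $((k % 2)) -eq 0 ]; then
            echo "k=$k is even; odd values are the usual choice"
            return 1
        fi
    done
    echo "ok: $kmers"
)

check_kmers 25,33,37 50
```

With the 50 bp reads from the question, 25,33,37 passes, while the 25,41,53 list from the docs example would be rejected because 53 exceeds the read length. Since the second mate in the question is only 48 bp long, checking against 48 instead of 50 would be the more conservative choice.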

Guarup commented 3 years ago

The tool looks excellent, and I would like to use it in my analysis. I followed the installation tutorial, including the installation of the metazoa database, and I got the following:

    bin docs precheck_TransPi.sh template.nextflow.config conf LICENSE README.md transpi_env.yml DBs nextflow remove_failed.sh TransPi.nf Dockerfile nextflow.config scripts update_databases.sh

When I ran the script with the location of my samples:

    nextflow run TransPi.nf --all --maxReadLen 150 --k 25,35,55,75 --reads /PATH/filter_samples/*_R[1,2].fastq.gz -profile conda --myConda

the following error was reported:

    nextflow: command not found

I would be grateful if you could guide me in resolving the problem.

rivera10 commented 3 years ago

Hello @Guarup,

Did you use the precheck script? It looks like the nextflow program is missing. I suggest running the precheck again to make sure all the requirements are installed. Let me know how it goes.

Best, Ramon

rivera10 commented 3 years ago

Also, it could be that nextflow is installed locally in the directory (done by the precheck), in which case you need to add ./ before the name to execute the program (e.g. ./nextflow ...).
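
The two cases described above (nextflow on the PATH, or a local copy left in the TransPi directory by the precheck) can be told apart with a short check. This is a minimal sketch: find_nextflow is a hypothetical helper, not part of TransPi or Nextflow, and it assumes the local copy is an executable file named nextflow in the current directory, as in the directory listing earlier in the thread.

```shell
# Hypothetical helper (not part of TransPi): print the command that should
# be used to launch nextflow, or fail if none can be found.
find_nextflow() {
    if command -v nextflow >/dev/null 2>&1; then
        # nextflow is already on the PATH.
        echo "nextflow"
    elif [ -x ./nextflow ]; then
        # A local, executable copy sits in the current directory.
        echo "./nextflow"
    else
        echo "nextflow not found; consider rerunning the precheck script" >&2
        return 1
    fi
}

# Example use from inside the TransPi directory (left commented out here):
#   NXF="$(find_nextflow)" && "$NXF" run TransPi.nf --all ...
```

If the local copy is found, adding the TransPi directory to the PATH (for example with export PATH="$PWD:$PATH") should have the same effect as typing ./nextflow each time.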