mhalushka / miRge

miRge - microRNA alignment software for small RNA-seq data, now at v2.0
GNU General Public License v3.0

Installation details #30

Open bchatelet opened 4 years ago

bchatelet commented 4 years ago

Hi, I am close to being a newbie.

"miRge2.0 relies on a huge number of libraries like: Bowtie indexes of genome, hairpin, mature miRNAs in miRBase, mature miRNAs in miRGeneDB, mRNA, rRNA, snoRNA, mature tRNA, primary tRNA, other ncRNA and spike-in sequences (optional); sequences of genome, mature miRNAs (including SNP information) in miRBase and miRGeneDB; coordinates of repetitive elements and mature miRNAs in the genome; and miRNA merge information in miRBase and MirGeneDB."

I don't know how to download or index these libraries. I get this error message:

"print 'The bowtie index file of %s_%s.*.ebwt is not located at %s, please check it.'%(species, type, indexPath)"

Could you be more specific? Regards

mhalushka commented 4 years ago

Sorry for the delay in getting back to you. The link to the libraries broke when we had to move storage systems. Which species are you interested in? I will get you the correct link. I think that once you have the libraries downloaded, the rest will work out.
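Once a library has been downloaded, the check implied by the error message quoted above (files named `<species>_<type>.*.ebwt` under an index path) can be approximated by hand. This is only a sketch of that naming pattern, not miRge's own code: the function name `find_bowtie_index` is hypothetical, and the example values `human` and `genome` are assumptions about how a library might be laid out.

```python
import glob
import os


def find_bowtie_index(index_path, species, kind):
    """Return the sorted list of files matching <species>_<kind>.*.ebwt.

    The pattern is taken from the error message in this thread; the
    directory layout around it is an assumption.
    """
    pattern = os.path.join(index_path, "%s_%s.*.ebwt" % (species, kind))
    return sorted(glob.glob(pattern))


def report(index_path, species, kind):
    """Build a one-line summary of whether any index files were found."""
    hits = find_bowtie_index(index_path, species, kind)
    if hits:
        return "found %d index file(s) for %s_%s" % (len(hits), species, kind)
    return "no %s_%s.*.ebwt files under %s" % (species, kind, index_path)


# Hypothetical usage; replace the path with wherever the library was extracted:
# print(report("/path/to/library/index", "human", "genome"))
```

An empty result means either the path is wrong or the library was not fully extracted, which is the situation the message describes.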

DeevanshuGoyal commented 3 years ago

Hi,

Post installation of miRge2.0, I tried to run the following command:

miRge2.0 --help

The response was as follows:

Traceback (most recent call last):
  File "/home/rcodaio/anaconda3/bin/miRge2.0", line 5, in <module>
    from mirge.main import main
  File "/home/rcodaio/anaconda3/lib/python3.8/site-packages/mirge/main.py", line 266
    print 'The bowtie index file of %s_%s.*.ebwt is not located at %s, please check it.'%(species, type, indexPath)

  1. Is this caused by a missing reference genome? If so, the link to the human genome seems to be broken. Can you please help me with that?
  2. If this is not the root cause, can you please help with the troubleshooting?

Regards

arunhpatil commented 3 years ago

@DeevanshuGoyal,

Hi. Yes, it is missing the reference genome. We now have the latest version of the pipeline, miRge3.0. Please find the documentation for installation and usage here.

Currently, the miRge3.0 libraries are available from SourceForge, and we will be moving them to AWS soon. If you have trouble accessing the libraries, please contact us and we can share the link over email.
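A side note on the traceback quoted earlier in this thread: it stops at line 266 of mirge/main.py, on a Python 2 style `print` statement, while the file path shows the package installed under python3.8. Python 3 rejects that syntax while parsing the file, before any library lookup runs, so the interpreter version may be worth ruling out alongside the missing libraries. A minimal sketch of that behaviour, using only the built-in `compile`; the helper name `parses_here` is made up for illustration:

```python
# The line from the traceback, exactly as written (Python 2 print statement).
PY2_LINE = (
    "print 'The bowtie index file of %s_%s.*.ebwt is not located at %s, "
    "please check it.'%(species, type, indexPath)"
)

# The same message written as a Python 3 function call.
PY3_LINE = (
    "print('The bowtie index file of %s_%s.*.ebwt is not located at %s, "
    "please check it.' % (species, type, indexPath))"
)


def parses_here(source):
    """Return True if the running interpreter can parse `source`.

    compile() only parses; it never runs the code, so the undefined
    names (species, indexPath) do not matter.
    """
    try:
        compile(source, "<mirge/main.py line 266>", "exec")
    except SyntaxError:
        return False
    return True


py2_style_ok = parses_here(PY2_LINE)
py3_style_ok = parses_here(PY3_LINE)
```

On a Python 3 interpreter the first form fails to parse and the second succeeds, which matches a traceback that ends on the `print` line itself.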

DeevanshuGoyal commented 3 years ago

@arunhpatil

Thanks for the response. This is helpful. I installed the new pipeline and I had a few questions regarding the working:

  1. Can miRge3.0 be used without installing the GUI component? I'd like to work purely on Command Line code on Debian.
  2. I tried to run the following command and got the response as mentioned below:

(biospace) rcodaio@DESKTOP-BCBFUCM:/mnt/d/miRge3_Lib$ miRge3.0 -s SRR1789566.fastq,SRR1789565.fastq,SRR1789564.fastq -lib miRge3_Lib -on human -db mirgenedb -o output_dir -gff -nmir -trf -ai -cpu 12 -a illumina
bowtie version: 1.3.0
cutadapt version: 3.4
Samtools version: 1.12
RNAfold version: 2.4.14
Collecting and validating input files...

WARNING: File SRR1789566.fastq does not exists! Omitting file SRR1789566.fastq

WARNING: File SRR1789565.fastq does not exists! Omitting file SRR1789565.fastq

WARNING: File SRR1789564.fastq does not exists! Omitting file SRR1789564.fastq

ERROR!: No valid input files were available! Please verify miRge -s arguments

My FASTQ files are located in miRge3Lib (where I extracted my human genome tar file), and the mirGeLib folder is in a different directory than the miRge3.0 tool. Where does the tool take its inputs from? I can see no option for that.

arunhpatil commented 3 years ago

@DeevanshuGoyal,

Hi, yes, miRge3.0 supports both the command-line interface and the GUI, so you can work without the GUI component. You are executing miRge3.0 in the wrong directory. Since the files are located in miRge3Lib, you should change to the directory where the input files are located. For example, change the directory to miRge3Lib, and then run miRge3.0:

(biospace) rcodaio@DESKTOP-BCBFUCM:/mnt/d/miRge3Lib$ miRge3.0 -s SRR1789566.fastq,SRR1789565.fastq,SRR1789564.fastq -lib miRge3_Lib -on human -db mirgenedb -o output_dir -gff -nmir -trf -ai -cpu 12 -a illumina

Run the command from the directory that contains the input files, or provide the absolute path for each of the raw files. I hope this is helpful; please let us know if you need further assistance.

Also, please note, the folder names are different, miRge3Lib and miRge3_Lib. If you have extracted human libraries in miRge3_Lib and your FASTQ files are in miRge3Lib, then you should change the directory to miRge3Lib and provide the corresponding path for libraries in -lib miRge3_Lib. Otherwise, the command and the log file look perfect.
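To see in advance which entries of a comma-separated `-s` value would be reported as missing, the relative names can be resolved against a directory by hand. This is a rough sketch, not how miRge3.0 validates its inputs internally; the function name `resolve_samples` and the example paths are hypothetical.

```python
import os


def resolve_samples(sample_arg, base_dir=None):
    """Split a comma-separated file list and resolve each entry.

    Relative names are resolved against base_dir (default: the current
    working directory); absolute paths are kept as they are.
    Returns two lists of absolute paths: (found, missing).
    """
    base_dir = base_dir or os.getcwd()
    found, missing = [], []
    for name in sample_arg.split(","):
        path = os.path.abspath(os.path.join(base_dir, name.strip()))
        if os.path.isfile(path):
            found.append(path)
        else:
            missing.append(path)
    return found, missing


def absolute_sample_arg(sample_arg, base_dir=None):
    """Rebuild the comma-separated value using absolute paths only."""
    found, _ = resolve_samples(sample_arg, base_dir)
    return ",".join(found)


# Hypothetical usage, with the directory that actually holds the FASTQ files:
# found, missing = resolve_samples(
#     "SRR1789566.fastq,SRR1789565.fastq,SRR1789564.fastq", "/mnt/d/miRge3_Lib")
# print(missing)
```

Anything that lands in `missing` points to a directory or file-name mismatch rather than a problem with the libraries.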

DeevanshuGoyal commented 3 years ago

@arunhpatil

Apologies but I think I had a typo in the original query. My directory name is miRge3_Lib which contains both the reference human genome and the input files. However, even if I am running the code while I am within the directory, it says that the files do not exist. Hence, the confusion. Could you help me here? Do I need to add this directory to the path or is there any way to provide the absolute path of the input files to the command via the CLI?

DeevanshuGoyal commented 3 years ago

@arunhpatil

I figured out the issue on my end and it's working. Thank you so much for all the help!

arunhpatil commented 3 years ago

@DeevanshuGoyal. Great!! Awesome. Please don't hesitate to contact us if you have any further queries.