Closed NgianZhenKai closed 1 year ago
Hi @NgianZhenKai,
This looks like miRge3.0 is either not using the bowtie binary executable or unable to access the miRge3.0 library files. An easy way to check is to execute the following two commands on the terminal:
1. /data/apps/bowtie-1.3.0/bowtie -h
If this returns the help menu/options, then the bowtie executable is fine. If it throws an error, then most likely the path for bowtie is incorrect. How have you installed miRge3.0, by bioconda or pip? If it is bioconda, you don't need to use the option -pbwt.
2. /data/apps/bowtie-1.3.0/bowtie -x /data/ChintongLab/zhenkai/APOE_miRNAseq/human_miRNA_library/human/index.Libs/human_mirna_MirGeneDB -n 0 -f --norc -S --threads 8 /data/ChintongLab/zhenkai/APOE_miRNAseq/Alignment/miRge.2021-06-17_05-12-50/bwtInput.fasta
Assuming your bowtie path is correct, execute this and see what error is shown this time. Also, check whether bwtInput.fasta has any sequences in FASTA format. You can do that with either the head or the less command on the terminal. Let us know what error is shown. This will help us resolve this issue further.
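The two checks above (is the bowtie path an executable file, and does bwtInput.fasta contain sequences) can also be scripted. This is only a minimal sketch: the helper names is_executable and count_fasta_records are made up for illustration and are not part of miRge3.0 or bowtie.

```python
import os


def is_executable(path):
    """True if `path` is an existing file that the current user may execute."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


def count_fasta_records(path):
    """Count '>' header lines that are followed by at least one sequence line."""
    records = 0
    have_header = False
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            if line.startswith(">"):
                have_header = True
            elif have_header:
                # first sequence line after a header completes one record
                records += 1
                have_header = False
    return records
```

For example, is_executable("/data/apps/bowtie-1.3.0/bowtie") should be True, and count_fasta_records on the bwtInput.fasta path from the command should be greater than zero if the alignment input is not empty.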
Thank you, Arun.
Hi @arunhpatil
Thank you very much for your kind assistance. I have tried out the different troubleshooting steps you advised.
1.) Bowtie returns the help menu/options nicely, so the bowtie executable is fine.
2.) I tried to rerun the command (from point 2), and the error states that it could not locate a Bowtie index corresponding to mirge3_Lib.
3.) Therefore I redownloaded the mirge3_lib and ran the script again. This time the alignment worked, but I got an error at the subsequent step:
/data/apps/python38/bin/miRge3.0 -pbwt /data/apps/bowtie-1.3.0 -s /data/ChintongLab/zhenkai/APOE_miRNAseq/Raw_fq/NSC_miCT1.fastq,/data/ChintongLab/zhenkai/APOE_miRNAseq/Raw_fq/NSC_miCT2.fastq,/data/ChintongLab/zhenkai/APOE_miRNAseq/Raw_fq/NSC_miKO1.fastq,/data/ChintongLab/zhenkai/APOE_miRNAseq/Raw_fq/NSC_miKO2.fastq -lib /data/apps/python38/miRge3_Lib -on human -db mirgenedb -o /data/ChintongLab/zhenkai/APOE_miRNAseq/Alignment -gff -nmir -trf -ai -cpu 8 -a illumina
bowtie version: 1.3.0 cutadapt version: 3.4 Samtools version: 1.2 RNAfold version: 2.1.9 Collecting and validating input files...
miRge3.0 will process 4 out of 4 input file(s).
Cutadapt finished for file NSC_miCT1 in 55.6237 second(s) Collapsing finished for file NSC_miCT1 in 0.7568 second(s)
Cutadapt finished for file NSC_miCT2 in 54.6685 second(s) Collapsing finished for file NSC_miCT2 in 6.1647 second(s)
Cutadapt finished for file NSC_miKO1 in 47.6461 second(s) Collapsing finished for file NSC_miKO1 in 8.2722 second(s)
Cutadapt finished for file NSC_miKO2 in 82.5348 second(s) Collapsing finished for file NSC_miKO2 in 12.4673 second(s)
Matrix creation finished in 2.419 second(s)
Data pre-processing completed in 272.8011 second(s)
Alignment in progress ... Alignment completed in 84.5381 second(s)
Summarizing and tabulating results...
Traceback (most recent call last):
File "/data/apps/python38/bin/miRge3.0", line 8, in
May I kindly know how to resolve the error shown above? I really appreciate your time and valuable help, thank you!
Hi @NgianZhenKai,
Yes, everything seems fine, except that there may be a conflict between the pandas and numpy versions. To rule out this possibility, please upgrade pandas and numpy and try the command again. You can upgrade them as shown below (use python3.7 if you are using py37):
python3.8 -m pip install --user --upgrade pandas
python3.8 -m pip install --user --upgrade numpy
or
pip3 install pandas --upgrade
pip3 install numpy --upgrade
Please let me know if this resolves the issue.
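To confirm which pandas and numpy versions the interpreter actually sees before and after the upgrade, a small sketch like the following may help. It relies only on the standard library (importlib.metadata, available from Python 3.8); the function name installed_versions is just an illustrative choice.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("pandas", "numpy")):
    """Return {package name: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Run it with the same interpreter that runs miRge3.0 (here /data/apps/python38/bin/python3.8), since a --user install and a system install can report different versions.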
Thank you, Arun.
Hi @arunhpatil
I have updated pandas and numpy and it worked nicely, but I got another error for samtools.
bowtie version: 1.3.0 cutadapt version: 3.4 Samtools version: 1.2 RNAfold version: 2.1.9 Collecting and validating input files...
miRge3.0 will process 4 out of 4 input file(s).
Cutadapt finished for file NSC_miCT1 in 54.787 second(s) Collapsing finished for file NSC_miCT1 in 0.818 second(s)
Cutadapt finished for file NSC_miCT2 in 54.4725 second(s) Collapsing finished for file NSC_miCT2 in 6.1347 second(s)
Cutadapt finished for file NSC_miKO1 in 47.3628 second(s) Collapsing finished for file NSC_miKO1 in 7.8998 second(s)
Cutadapt finished for file NSC_miKO2 in 84.4352 second(s) Collapsing finished for file NSC_miKO2 in 12.8581 second(s)
Matrix creation finished in 2.0609 second(s)
Data pre-processing completed in 273.0259 second(s)
Alignment in progress ... Alignment completed in 84.8627 second(s)
Summarizing and tabulating results... The number of A-to-I editing sites is less than 10 so that no heatmap is drawn. Summary completed in 18.1014 second(s)
Predicting novel miRNAs
Performing prediction of novel miRNAs... Start to predict
Traceback (most recent call last):
File "/data/apps/python38/bin/miRge3.0", line 8, in
So sorry to trouble you, and thank you once again for your time!
Hi @NgianZhenKai,
It is no trouble at all. These things happen quite often. I would recommend using the bioconda installation process. Conda allows for the creation and use of environments, so it is a one-step installation process that does not interfere with the system's other dependencies/tools. Nevertheless, pip is great too; occasionally we come across a few errors like these.
To resolve this, let's try to run the following commands:
less /data/ChintongLab/zhenkai/APOE_miRNAseq/Alignment/miRge.2021-06-26_02-51-33/unmapped_tmp/unmapped_mirna_NSC_miCT1_vs_genome.sam
samtools sort --threads 8 -O sam -T sample.sort -o /data/ChintongLab/zhenkai/APOE_miRNAseq/Alignment/miRge.2021-06-26_02-51-33/unmapped_tmp/unmapped_mirna_NSC_miCT1_vs_genome_sorted.sam /data/ChintongLab/zhenkai/APOE_miRNAseq/Alignment/miRge.2021-06-26_02-51-33/unmapped_tmp/unmapped_mirna_NSC_miCT1_vs_genome.sam
Thank you, Arun
Dear @arunhpatil
[zhenkai@tll-rv1 ~]$ less /data/ChintongLab/zhenkai/APOE_miRNAseq/Alignment/miRge.2021-06-26_02-51-33/unmapped_tmp/unmapped_mirna_NSC_miCT1_vs_genome.sam
@HD VN:1.0 SO:unsorted
@SQ SN:chr1 LN:248956422
@SQ SN:chr2 LN:242193529
@SQ SN:chr3 LN:198295559
@SQ SN:chr4 LN:190214555
@SQ SN:chr5 LN:181538259
@SQ SN:chr6 LN:170805979
@SQ SN:chr7 LN:159345973
@SQ SN:chr8 LN:145138636
@SQ SN:chr9 LN:138394717
@SQ SN:chr10 LN:133797422
@SQ SN:chr11 LN:135086622
@SQ SN:chr12 LN:133275309
@SQ SN:chr13 LN:114364328
@SQ SN:chr14 LN:107043718
@SQ SN:chr15 LN:101991189
@SQ SN:chr16 LN:90338345
@SQ SN:chr17 LN:83257441
@SQ SN:chr18 LN:80373285
@SQ SN:chr19 LN:58617616
@SQ SN:chr20 LN:64444167
@SQ SN:chr21 LN:46709983
@SQ SN:chr22 LN:50818468
@SQ SN:chrX LN:156040895
@SQ SN:chrY LN:57227415
@SQ SN:chrM LN:16569
@SQ SN:chr1_KI270706v1_random LN:175055
@SQ SN:chr1_KI270707v1_random LN:32032
@SQ SN:chr1_KI270708v1_random LN:127682
sort: invalid option -- '-'
Usage: samtools sort [options...] [in.bam]
Options:
  -l INT     Set compression level, from 0 (uncompressed) to 9 (best)
  -m INT     Set maximum memory per thread; suffix K/M/G recognized [768M]
  -n         Sort by read name
  -o FILE    Write final output to FILE rather than standard output
  -O FORMAT  Write output as FORMAT ('sam'/'bam'/'cram')   (either -O or
  -T PREFIX  Write temporary files to PREFIX.nnnn.bam       -T is required)
  -@ INT     Set number of sorting and compression threads [1]
Legacy usage: samtools sort [options...]
Thank you very much!
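The usage text above lists only -@ for threads, which suggests that this samtools build (1.2) does not accept the long --threads option used in the sort command. As a rough sketch, the version banner that miRge3.0 prints can be parsed and compared numerically. The function name parse_tool_versions is made up for illustration, and the (1, 4) threshold is an assumption taken from the samtools version reported as working later in this thread, not a documented minimum.

```python
import re


def parse_tool_versions(banner):
    """Extract {tool name (lower case): version tuple} from a version banner."""
    versions = {}
    for name, ver in re.findall(r"(\w+) version: (\d+(?:\.\d+)*)", banner):
        versions[name.lower()] = tuple(int(part) for part in ver.split("."))
    return versions


def samtools_new_enough(banner, minimum=(1, 4)):
    """Compare the parsed samtools version against `minimum` (an assumed threshold)."""
    found = parse_tool_versions(banner).get("samtools")
    return found is not None and found >= minimum
```

Comparing tuples instead of strings matters here: as text, "1.12" sorts before "1.4", while (1, 12) >= (1, 4) is correctly true.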
Dear @arunhpatil
I have tried to update the Samtools version and it works nicely now as shown below.
bowtie version: 1.3.0 cutadapt version: 3.4 Samtools version: 1.4 RNAfold version: 2.1.9 Collecting and validating input files...
miRge3.0 will process 4 out of 4 input file(s).
Cutadapt finished for file NSC_miCT1 in 51.9033 second(s) Collapsing finished for file NSC_miCT1 in 1.0319 second(s)
Cutadapt finished for file NSC_miCT2 in 51.2453 second(s) Collapsing finished for file NSC_miCT2 in 8.3537 second(s)
Cutadapt finished for file NSC_miKO1 in 44.422 second(s) Collapsing finished for file NSC_miKO1 in 11.802 second(s)
Cutadapt finished for file NSC_miKO2 in 79.6492 second(s) Collapsing finished for file NSC_miKO2 in 16.2678 second(s)
Matrix creation finished in 1.6499 second(s)
Data pre-processing completed in 268.4421 second(s)
Alignment in progress ... Alignment completed in 83.081 second(s)
Summarizing and tabulating results... The number of A-to-I editing sites is less than 10 so that no heatmap is drawn. Summary completed in 16.8916 second(s)
Predicting novel miRNAs
Performing prediction of novel miRNAs... Start to predict No cluster sequences are generated and prediction is aborted. Prediction of novel miRNAs Completed (59.14 sec)
The analysis completed in 431.1037 second(s)
However, I find that most of the reads have been filtered away, and very few reads are mapped to miRNA. May I kindly know, based on your experience, whether it's a trimming problem or a problem in my sequencing files? Below are the links to the summary reports.
file:///C:/Users/zhenkai/Desktop/Apoe%20RNAseq/Jiazi%20micro-RNAseq/miRge.2021-06-28_02-56-35/annotation.report.html file:///C:/Users/zhenkai/Desktop/Apoe%20RNAseq/Jiazi%20micro-RNAseq/miRge.2021-06-28_02-56-35/miRge3_visualization.html
Thanks once again!
Hi @NgianZhenKai,
This is great. I can't see the HTML files you sent, but I get the picture from what you stated. This is most probably due to the use of an incorrect adapter sequence. A quick way to check is to search the raw reads for a well-known miRNA sequence and look at the bases that follow it, for example:
>hsa-let-7a-5p
TGAGGTAGTAGGTTGTATAGTT
$ grep "TGAGGTAGTAGGTTGTATAGTT" NSC_miCT2.fastq | less
Sometimes, there might be 4 bp barcodes along with the adapters and/or Unique Molecular Identifiers (UMIs) based on the experiment design. If you want me to help you figure out the same, then please send me a sample/subset of your FASTQ file to my Gmail.
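The same check as the grep above can be summarised automatically by counting which bases most often follow the miRNA sequence in the raw reads. This is a minimal sketch, not part of miRge3.0: the function name tally_downstream is made up, and it assumes a plain (uncompressed) FASTQ file with exactly four lines per record.

```python
from collections import Counter


def tally_downstream(fastq_path, anchor, width=12, max_reads=1_000_000):
    """Count the `width` bases that follow `anchor` in each read containing it."""
    counts = Counter()
    with open(fastq_path) as handle:
        for line_no, line in enumerate(handle):
            if line_no // 4 >= max_reads:
                break
            if line_no % 4 != 1:  # the sequence is the 2nd line of each record
                continue
            read = line.strip()
            pos = read.find(anchor)
            if pos == -1:
                continue
            start = pos + len(anchor)
            tail = read[start:start + width]
            if tail:
                counts[tail] += 1
    return counts
```

For example, tally_downstream("NSC_miCT2.fastq", "TGAGGTAGTAGGTTGTATAGTT").most_common(5) lists the five most frequent sequences seen directly after hsa-let-7a-5p. If one of them dominates, it is likely the start of the 3' adapter, possibly preceded by barcode or UMI bases.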
Thank you, Arun.
Hi @arunhpatil
I have tried the instructions you suggested and compiled the results in the attached Word document.
I have also shared one of the FASTQ files to your Gmail so that you can use it for troubleshooting.
Thank you very much for your advice and help!
Best Regards, Zhen Kai
Troubleshooting miRge3.0.docx
Hi @NgianZhenKai,
I did the grep, and the adapter sequence at the 3' end is "AGATCGGAAGAGCACACGTCTGAACTCC", not the illumina adapter ("TGGAATTCTCGGGTGCCAAGGAACTCCAG"). You might be thinking that because the sequencing was done on an Illumina platform, the adapter must be illumina. However, the person who prepared the library might have used a different adapter sequence; in this instance it is "AGATCGGAAGAGCACACGTCTGAACTCC". In our experience it is the illumina adapter sequence most of the time, but it is always good practice to verify the adapter sequences before performing any analysis.
Please replace the adapter option in your command as shown below:
-a AGATCGGAAGAGCACACGTCTGAACTCC
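One way to verify an adapter choice before rerunning is to measure what fraction of reads contain the start of each candidate adapter. The sketch below is only illustrative: adapter_hit_fractions and the dictionary labels are made-up names, the two sequences are the ones quoted in this thread, and a plain four-line-per-record FASTQ file is assumed.

```python
def adapter_hit_fractions(fastq_path, adapters, prefix_len=12, max_reads=200_000):
    """Fraction of reads containing the first `prefix_len` bases of each adapter."""
    hits = {name: 0 for name in adapters}
    total = 0
    with open(fastq_path) as handle:
        for line_no, line in enumerate(handle):
            if line_no % 4 != 1:  # only look at sequence lines
                continue
            if total >= max_reads:
                break
            total += 1
            for name, seq in adapters.items():
                if seq[:prefix_len] in line:
                    hits[name] += 1
    return {name: (n / total if total else 0.0) for name, n in hits.items()}


# Candidate adapters quoted in this thread; the labels are arbitrary.
CANDIDATES = {
    "illumina": "TGGAATTCTCGGGTGCCAAGGAACTCCAG",
    "found_by_grep": "AGATCGGAAGAGCACACGTCTGAACTCC",
}
```

Calling adapter_hit_fractions("NSC_miCT1.fastq", CANDIDATES) would be expected to give a much higher fraction for the adapter that was actually used in library preparation.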
Thank you, Arun.
Hi @arunhpatil
Thank you for your swift response as always. I have replaced the adapter option in the command, and I think the trimming is now resolved, as I have more aligned reads. However, I still face another error, as shown below.
bowtie version: 1.3.0 cutadapt version: 3.4 Samtools version: 1.4 RNAfold version: 2.1.9 Collecting and validating input files...
miRge3.0 will process 4 out of 4 input file(s).
Cutadapt finished for file NSC_miCT1 in 50.1478 second(s) Collapsing finished for file NSC_miCT1 in 0.6804 second(s)
Cutadapt finished for file NSC_miCT2 in 50.1547 second(s) Collapsing finished for file NSC_miCT2 in 5.1365 second(s)
Cutadapt finished for file NSC_miKO1 in 43.1387 second(s) Collapsing finished for file NSC_miKO1 in 7.3246 second(s)
Cutadapt finished for file NSC_miKO2 in 76.7487 second(s) Collapsing finished for file NSC_miKO2 in 8.3266 second(s)
Matrix creation finished in 1.1774 second(s)
Data pre-processing completed in 246.6504 second(s)
Alignment in progress ... Alignment completed in 85.3076 second(s)
Summarizing and tabulating results... The number of A-to-I editing sites is less than 10 so that no heatmap is drawn. Summary completed in 648.0974 second(s)
Predicting novel miRNAs
Traceback (most recent call last):
File "/data/apps/python38/bin/miRge3.0", line 8, in
Thank you for your patience once again.
Best, Zhen Kai
Hi @NgianZhenKai,
I was able to successfully run your parameters on the file you shared (NSC_miCT1). I will learn more if you could try with just this file, and also try -db with both miRGeneDB and miRBase separately. In the meantime, I will try to fix this issue.
Thank you, Arun.
Hi @arunhpatil
I have tried -db with both miRGeneDB and miRBase separately on the NSC_miCT1 file only, but still got the same error in both cases:
upPrecusor = wholePrecusorNameContentDic[clusterName+':precusor_1']
KeyError: 'NSC_miCT1:miRCluster_2_25:chr1:630751_630775+:precusor_1'
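The KeyError means the dictionary has no entry for the key built from the cluster name plus ':precusor_1'. The sketch below is purely a diagnostic illustration and not the miRge3.0 fix: the helper names and the toy dictionary are hypothetical, and only the key format is taken from the error message.

```python
def related_keys(mapping, cluster_name):
    """List the keys in `mapping` that start with `cluster_name`, sorted."""
    return sorted(key for key in mapping if key.startswith(cluster_name))


def lookup_or_none(mapping, cluster_name, suffix=":precusor_1"):
    """Return mapping[cluster_name + suffix], or None when the key is absent."""
    return mapping.get(cluster_name + suffix)


# Toy data (hypothetical): a cluster that has an entry under a different suffix.
toy = {"clusterA:precusor_2": "ACGU"}
missing = lookup_or_none(toy, "clusterA")   # None instead of raising KeyError
nearby = related_keys(toy, "clusterA")      # shows which keys do exist
```

Listing the keys that do exist for the failing cluster can show whether the entry was never created or was stored under a slightly different name.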
In the meantime, I have asked my administrator to help me install a later version of samtools (v1.9), as I saw others have used more updated versions in the other threads. I will update you once I try and hopefully it can resolve the issue, thank you!
Dear @arunhpatil
I have updated samtools but still got the same error as seen previously. I think it is stuck at predicting novel miRNAs.
bowtie version: 1.3.0 cutadapt version: 3.4 Samtools version: 1.12 RNAfold version: 2.1.9 Collecting and validating input files...
miRge3.0 will process 1 out of 1 input file(s).
Cutadapt finished for file NSC_miCT1 in 50.2071 second(s) Collapsing finished for file NSC_miCT1 in 0.5735 second(s)
Matrix creation finished in 0.3742 second(s)
Data pre-processing completed in 52.0354 second(s)
Alignment in progress ... Alignment completed in 31.9073 second(s)
Summarizing and tabulating results... The number of A-to-I editing sites is less than 10 so that no heatmap is drawn. Summary completed in 223.8272 second(s)
Predicting novel miRNAs
Performing prediction of novel miRNAs... Start to predict
Traceback (most recent call last):
File "/data/apps/python38/bin/miRge3.0", line 8, in
Nonetheless, when I remove the -nmir option, the analysis runs smoothly, so I will be using those data for my analysis for now. Thank you very much once again for your help!
Hi @NgianZhenKai,
Let me look at this issue in detail. If I can figure it out, I will fix it in the code, and the fix will be available in the next update. Thank you for bringing this to our attention.
Hi there,
I am trying to run miRge3 on 4 human miRNAseq samples. I am new to bioinformatics and have self-taught only very basic commands. I tried to look this up online, but I am not sure how to understand and resolve the bowtie error shown below. I would greatly appreciate it if someone could explain what the error means and how to proceed from there, thank you very much.
#!/bin/sh
#$ -V
#$ -cwd
# Specify mpi
#$ -pe smp 8
# miRge3.0
/data/apps/python38/bin/python3.8 /data/apps/python38/bin/miRge3.0 -pbwt /data/apps/bowtie-1.3.0 -s /data/ChintongLab/zhenkai/APOE_miRNAseq/Raw_fq/NSC_miCT1.fastq,/data/ChintongLab/zhenkai/APOE_miRNAseq/Raw_fq/NSC_miCT2.fastq,/data/ChintongLab/zhenkai/APOE_miRNAseq/Raw_fq/NSC_miKO1.fastq,/data/ChintongLab/zhenkai/APOE_miRNAseq/Raw_fq/NSC_miKO2.fastq -lib /data/ChintongLab/zhenkai/APOE_miRNAseq/human_miRNA_library -on human -db mirgenedb -o /data/ChintongLab/zhenkai/APOE_miRNAseq/Alignment -gff -nmir -trf -ai -cpu 8 -a illumina
bowtie version: 1.3.0 cutadapt version: 3.4 Samtools version: 1.2 RNAfold version: 2.1.9 Collecting and validating input files...
miRge3.0 will process 4 out of 4 input file(s).
Cutadapt finished for file NSC_miCT1 in 55.7864 second(s) Collapsing finished for file NSC_miCT1 in 0.7193 second(s)
Cutadapt finished for file NSC_miCT2 in 54.2349 second(s) Collapsing finished for file NSC_miCT2 in 6.0364 second(s)
Cutadapt finished for file NSC_miKO1 in 47.697 second(s) Collapsing finished for file NSC_miKO1 in 8.1637 second(s)
Cutadapt finished for file NSC_miKO2 in 84.2334 second(s) Collapsing finished for file NSC_miKO2 in 15.4963 second(s)
Matrix creation finished in 2.3941 second(s)
Data pre-processing completed in 276.9964 second(s)
Alignment in progress ...
Traceback (most recent call last):
File "/data/apps/python38/bin/miRge3.0", line 8, in <module>
sys.exit(main())
File "/data/apps/python38/lib/python3.8/site-packages/mirge/main.py", line 105, in main
pdDataFrame = bwtAlign(args,pdDataFrame,workDir,ref_db)
File "/data/apps/python38/lib/python3.8/site-packages/mirge/libs/manifoldAlign.py", line 100, in bwtAlign
alignPlusParse(bwtExec, bwt_iter, pdDataFrame, args, workDir)
File "/data/apps/python38/lib/python3.8/site-packages/mirge/libs/manifoldAlign.py", line 19, in alignPlusParse
bowtie = subprocess.run(str(bwtExec), shell=True, check=True, stdout=subprocess.PIPE, text=True, stderr=subprocess.PIPE, universal_newlines=True)
File "/data/apps/python38/lib/python3.8/subprocess.py", line 512, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '/data/apps/bowtie-1.3.0/bowtie -x /data/ChintongLab/zhenkai/APOE_miRNAseq/human_miRNA_library/human/index.Libs/human_mirna_MirGeneDB -n 0 -f --norc -S --threads 8 /data/ChintongLab/zhenkai/APOE_miRNAseq/Alignment/miRge.2021-06-17_05-12-50/bwtInput.fasta' returned non-zero exit status 1.
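The traceback only reports that bowtie exited with status 1; the message bowtie itself printed is not shown. One way to see it is to run the same command directly and capture its stderr. A minimal sketch, assuming the command is passed as a list of arguments (unlike the shell=True string in the traceback); run_and_report is an illustrative name, not a miRge3.0 function.

```python
import subprocess


def run_and_report(command):
    """Run `command` (a list of arguments); return (exit code, stdout, stderr)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr
```

For example, calling run_and_report with the bowtie command from the error message split into a list (["/data/apps/bowtie-1.3.0/bowtie", "-x", ...]) and printing the third returned value should show bowtie's own explanation of the failure.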