mhalushka / miRge3.0

Comprehensive analysis of small RNA sequencing data
MIT License
Error while running miRge3.0 #88

Open CarolineWits opened 10 months ago

CarolineWits commented 10 months ago

Hi, I am new to both NGS analysis and using Linux. I have installed miRge3.0 on my Linux VM and wanted to try running my analysis. I used the following command:

miRge3.0 -s NGS_Results/RD-ADE-1001_S2_L001_R1_001.fastq.gz,NGS_Results/RD-ADE-840_S1_L001_R1_001.fastq.gz -lib miRge3_Lib -on human -db mirgenedb -o output_dir -bam -a illumina

I got the following error:

Summarizing and tabulating results...
Traceback (most recent call last):
  File "/home/manager/miniconda/envs/ngsbio/bin/miRge3.0", line 10, in <module>
    sys.exit(main())
  File "/home/manager/miniconda/envs/ngsbio/lib/python3.10/site-packages/mirge/main.py", line 166, in main
    summarize(args, workDir, ref_db, base_names, pdMapped, sampleReadCounts, trimmedReadCounts, trimmedReadCountsUnique)
  File "/home/manager/miniconda/envs/ngsbio/lib/python3.10/site-packages/mirge/libs/summary.py", line 744, in summarize
    subpdMapped['miRNA_cbind'] = subpdMapped[['exact miRNA', 'isomiR miRNA']].apply(lambda x: ''.join(x), axis = 1)
  File "/home/manager/miniconda/envs/ngsbio/lib/python3.10/site-packages/pandas/core/frame.py", line 3968, in __setitem__
    self._set_item_frame_value(key, value)
  File "/home/manager/miniconda/envs/ngsbio/lib/python3.10/site-packages/pandas/core/frame.py", line 4123, in _set_item_frame_value
    raise ValueError(
ValueError: Cannot set a DataFrame with multiple columns to the single column miRNA_cbind

Can you assist? Thank you

arunhpatil commented 10 months ago

Hi @CarolineWits,

Your command looks good. Can you check that the adapter is actually Illumina? How to check is described here. If you would like me to check, can you share a subset/sample of your input file?
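As a rough way to check the adapter yourself, the sketch below counts how many reads in a FASTQ file contain the start of a candidate adapter sequence. It is not part of miRge3.0: the function name `adapter_fraction`, the 12-base seed length, and the 100,000-read cap are all assumptions chosen for illustration. The first candidate is the sequence used later in this thread; the second is the commonly cited TruSeq Small RNA 3' adapter, included here only as an assumption about what to compare against.

```python
import gzip
import sys
from itertools import islice

# Candidate adapters to compare (edit this to match your library prep kit).
CANDIDATE_ADAPTERS = {
    "sequence used in this thread": "AGATCGGAAGAGCACACGTCTGAACTCCA",
    "TruSeq Small RNA 3' (assumed)": "TGGAATTCTCGGGTGCCAAGG",
}


def adapter_fraction(fastq_path, adapter, max_reads=100_000, seed_len=12):
    """Return the fraction of the first `max_reads` reads that contain
    the first `seed_len` bases of `adapter` anywhere in the read."""
    seed = adapter[:seed_len].upper()
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    total = hits = 0
    with opener(fastq_path, "rt") as handle:
        # A FASTQ record is 4 lines; the sequence is the 2nd line of each record.
        for seq in islice(handle, 1, max_reads * 4, 4):
            total += 1
            if seed in seq.upper():
                hits += 1
    return hits / total if total else 0.0


if __name__ == "__main__":
    # Usage: python check_adapter.py sample1.fastq.gz sample2.fastq.gz
    for path in sys.argv[1:]:
        for label, adapter in CANDIDATE_ADAPTERS.items():
            frac = adapter_fraction(path, adapter)
            print(f"{path}\t{label}\t{frac:.1%} of sampled reads")
```

For small RNA libraries the correct adapter would typically be found in a large share of reads, while a wrong one would appear in almost none, so comparing the percentages across candidates should indicate which sequence to pass to -a.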

Thank you, Arun

CarolineWits commented 10 months ago

Hi @arunhpatil, thank you so much, that seems to have worked.
My next problem (sorry, I am really new to all this!): I am trying to perform a differential expression analysis and am using the following command:

miRge3.0 -s NGS_Results/RD-ADE-840.fastq.gz,NGS_Results/RD-ADE-1001.fastq.gz,NGS_Results/RD-ASU-006.fastq.gz,NGS_Results/RD-ASU-048.fastq.gz,NGS_Results/RD-EDE-011.fastq.gz,NGS_Results/RD-EJG-061.fastq.gz,NGS_Results/RD-EJG-067.fastq.gz,NGS_Results/RD-LTH-704.fastq.gz,NGS_Results/RD-LTH-783.fastq.gz,NGS_Results/RD-OLF-050.fastq.gz -lib miRge3_Lib -on human -db MirGeneDB -o differential_Exp -a "AGATCGGAAGAGCACACGTCTGAACTCCA" -dex -mdt NGS_Results/DESmetadata.csv

I get the following warnings:

Summarizing and tabulating results...
/home/manager/miniconda/envs/ngsbio/lib/python3.10/site-packages/mirge/libs/summary.py:764: FutureWarning: The default value of numeric_only in DataFrameGroupBy.sum is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
  df = df.groupby(['miRNA']).sum()[base_names]
/home/manager/miniconda/envs/ngsbio/lib/python3.10/site-packages/mirge/libs/summary.py:769: FutureWarning: The default value of numeric_only in DataFrameGroupBy.sum is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
  miRNA_df = subpdMapped.groupby(['miRNA_cbind']).sum()[base_names]
Summary completed in 4.4172 second(s)

Performing differential expression...

Installing package into ‘/home/manager/R/x86_64-conda-linux-gnu-library/4.0’
(as ‘lib’ is unspecified)
Error in contrib.url(repos, type) :
  trying to use CRAN without setting a mirror
Calls: install.packages -> startsWith -> contrib.url
In addition: Warning message:
In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, :
  there is no package called ‘BiocManager’
Execution halted

Can you help with what I have done wrong?

Thank you, Caroline

arunhpatil commented 10 months ago

Hi @CarolineWits,

No problem, and you can ignore the warnings for now. miRge3.0 is looking for the R packages it needs for differential expression analysis. It tried to install them but couldn't. Can you install the following packages from the R command line?

Type R at the command prompt and, once the R session opens, enter the following commands.

This will install BiocManager, which is required for the DESeq2 installation:

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

Then install DESeq2:

if (!require("DESeq2", quietly = TRUE))
    BiocManager::install("DESeq2")

Finally, install ggplot2 for plotting:

install.packages('ggplot2')

To test that it has worked, type:

library(DESeq2)
library(ggplot2)

Once you have all of these installed, type q(). When prompted, you can choose whether to save the workspace (y) or not (n).

This is a one-time installation only, and you can then run miRge3.0 again.
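Before re-running miRge3.0, it may be worth confirming from the shell that the three packages can be loaded. The following is a minimal sketch, not something miRge3.0 provides: the helper names `build_r_check` and `missing_r_packages` are hypothetical, and it assumes the `Rscript` found on your PATH belongs to the same R installation that miRge3.0 calls (for example, the one inside your conda environment).

```python
import shutil
import subprocess

# Packages named in the instructions above.
REQUIRED_R_PACKAGES = ["BiocManager", "DESeq2", "ggplot2"]


def build_r_check(packages):
    """Build an R one-liner that prints, one per line, the names of
    packages that cannot be loaded."""
    vec = ", ".join(f'"{p}"' for p in packages)
    return (
        f"pkgs <- c({vec}); "
        "missing <- pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]; "
        "cat(missing, sep = '\\n')"
    )


def missing_r_packages(packages=REQUIRED_R_PACKAGES):
    """Return the list of packages R cannot load, or None if Rscript
    is not available on PATH."""
    rscript = shutil.which("Rscript")
    if rscript is None:
        return None
    result = subprocess.run(
        [rscript, "-e", build_r_check(packages)],
        capture_output=True,
        text=True,
        check=False,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    missing = missing_r_packages()
    if missing is None:
        print("Rscript was not found on PATH; activate the environment that provides R.")
    elif missing:
        print("Still missing in R:", ", ".join(missing))
    else:
        print("All required R packages can be loaded.")
```

If anything is reported as missing, repeat the corresponding install step in the R session first.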

Thank you, Arun