fallerlab / ARF

Differential Ribosomal heterogeneity Prediction through Ribosome Profiling generated rRNA fragments and their Proximity to Ribosomal Proteins
GNU General Public License v3.0

<simpleError in dim(ordered) <- ns: dims [product 1] do not match the length of object [0]> #5

Closed nayanvs closed 1 year ago

nayanvs commented 1 year ago

<simpleError in dim(ordered) <- ns: dims [product 1] do not match the length of object [0]>
Error in dripARF_result_heatmap(dripARF_results = dripARF_results, targetDir = targetDir, : object 'last' not found

I solved the initial issue: sample.tsv requires more than three columns, unlike what the README states. The test data runs well with the qvalue fix, but other errors occur with other published datasets.

ferhatalkan commented 1 year ago

Hi, I'm not sure about the error you reported. Can you share how many replicates per group you had in the analysis? A common cause is a group with only a single replicate, which triggers several errors down the line. Can you also clarify which other datasets produced which errors so that I can take a look at them?

nayanvs commented 1 year ago

Hi! Thank you for your reply. I have three replicates per group and a total of six samples. I made a sample.tsv file as close as possible to the one for the test datasets shipped with the package, to check whether the error is related to the .tsv file. But the error still persists.

head(sample.tsv)
sampleName  bedGraphFile            group  taggedRP    ribosomeIsolation
TTX1        test_data/aS1.bedGraph  TTX    uL1_w_FLAG  FLAG
TTX2        test_data/aS2.bedGraph  TTX    uL1_w_FLAG  FLAG
TTX3        test_data/aS3.bedGraph  TTX    uL1_w_FLAG  FLAG
cLTP1       test_data/aE1.bedGraph  cLTP   eL22_w_HA   HA
cLTP2       test_data/aE2.bedGraph  cLTP   eL22_w_HA   HA
cLTP3       test_data/aE3.bedGraph  cLTP   eL22_w_HA   HA
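
The replicate count per group can be double-checked directly from the samples file. The following is a minimal sketch, assuming a tab-separated file with a header row that contains a `group` column (as in the table posted here). The function name `count_replicates` is hypothetical and not part of ARF.

```shell
#!/bin/sh
# count_replicates: hypothetical helper, not part of ARF.
# Prints "<group><TAB><number of samples>" for a tab-separated samples file
# whose header row contains a column named "group".
count_replicates() {
    awk -F '\t' '
        NR == 1 {
            # Locate the "group" column from the header row.
            for (i = 1; i <= NF; i++) if ($i == "group") col = i
            if (!col) { print "no group column found" > "/dev/stderr"; exit 1 }
            next
        }
        # Count one sample per non-empty data row.
        NF > 0 { n[$col]++ }
        END { for (g in n) printf "%s\t%d\n", g, n[g] }
    ' "$1" | sort
}
```

Calling `count_replicates sample.tsv` lists each group with its sample count; any group reported with a count of 1 is a single-replicate group of the kind the maintainer asks about.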

ferhatalkan commented 1 year ago

Thank you. I need to know a few more things, though. Can you share which organism your data is from, and the output you get when you run 'cut -f1 YOUR_BEDGRAPH_FILE | sort | uniq -c' on your bedGraph files?

nayanvs commented 1 year ago

Thank you for your reply. The organism is "mouse" (your built-in reference).

nayan@NGS: ~/Rdirectory/dripARF_results$ cut -f1 test_data/E1.bedGraph | sort | uniq -c
   1813 NR_003278.3
   4588 NR_003279.1
    152 NR_003280.2
    119 NR_030686.1
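
To run this check over several bedGraph files at once, the command can be wrapped in a loop. This is only a sketch: `list_references` is a hypothetical name, not an ARF utility, and the pipeline inside is the one suggested by the maintainer.

```shell
#!/bin/sh
# list_references: hypothetical helper, not part of ARF.
# For each bedGraph file given, print the file name followed by the number of
# intervals per reference sequence (column 1 of the file).
list_references() {
    for f in "$@"; do
        printf '== %s\n' "$f"
        cut -f1 "$f" | sort | uniq -c
    done
}
```

For example, `list_references test_data/*.bedGraph` prints one block per file, which makes it easy to spot a sample whose reference names differ from the rest.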

ferhatalkan commented 1 year ago

It seems you mapped your reads to human rRNAs rather than mouse rRNAs?

nayanvs commented 1 year ago

I selected organism = "mm":

dripARF_result <- ARF::dripARF(samplesFile = "/Rdirectory/dripARF_results/sample2.tsv", organism = "mm", QCplot = TRUE, targetDir = "dripARF_results/")
[1] "Reading bedgraph/bam file for sample TTX1 ( 1 of 6 )"
[1] "14.65 % is done..."
[1] "29.31 % is done..."
[1] "43.96 % is done..."
[1] "58.62 % is done..."
[1] "73.27 % is done..."
[1] "87.92 % is done..."
[1] "Reading bedgraph/bam file for sample TTX2 ( 2 of 6 )"
[1] "14.55 % is done..."
[1] "29.1 % is done..."
[1] "43.64 % is done..."
[1] "58.19 % is done..."
[1] "72.74 % is done..."
[1] "87.29 % is done..."
[1] "Reading bedgraph/bam file for sample TTX3 ( 3 of 6 )"
[1] "14.55 % is done..."
[1] "29.11 % is done..."
[1] "43.66 % is done..."
[1] "58.22 % is done..."
[1] "72.77 % is done..."
[1] "87.32 % is done..."
[1] "Reading bedgraph/bam file for sample cLTP1 ( 4 of 6 )"
[1] "14.59 % is done..."
[1] "29.18 % is done..."
[1] "43.77 % is done..."
[1] "58.36 % is done..."
[1] "72.95 % is done..."
[1] "87.54 % is done..."
[1] "Reading bedgraph/bam file for sample cLTP2 ( 5 of 6 )"
[1] "14.55 % is done..."
[1] "29.09 % is done..."
[1] "43.64 % is done..."
[1] "58.18 % is done..."
[1] "72.73 % is done..."
[1] "87.27 % is done..."
[1] "Reading bedgraph/bam file for sample cLTP3 ( 6 of 6 )"
[1] "14.56 % is done..."
[1] "29.13 % is done..."
[1] "43.69 % is done..."
[1] "58.26 % is done..."
[1] "72.82 % is done..."
[1] "87.39 % is done..."
[1] "ALL bedgraphs have been read."
No id variables; using all as measure variables
converting counts to integer mode
estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
-- note: fitType='parametric', but the dispersion trend was not well captured by the function: y = a/x + b, and a local regression fit was automatically substituted. specify fitType='local' or 'mean' to avoid this message next time.
final dispersion estimates
fitting model and testing
-- note: fitType='parametric', but the dispersion trend was not well captured by the function: y = a/x + b, and a local regression fit was automatically substituted. specify fitType='local' or 'mean' to avoid this message next time.
[1] "Comparing TTX vs cLTP"

out of 6868 with nonzero total read count
adjusted p-value < 0.05
LFC > 0.50 (up)    : 0, 0%
LFC < -0.50 (down) : 0, 0%
outliers [1]       : 0, 0%
low counts [2]     : 0, 0%
(mean count < 121)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

NULL
-- note: fitType='parametric', but the dispersion trend was not well captured by the function: y = a/x + b, and a local regression fit was automatically substituted. specify fitType='local' or 'mean' to avoid this message next time.
-- note: fitType='parametric', but the dispersion trend was not well captured by the function: y = a/x + b, and a local regression fit was automatically substituted. specify fitType='local' or 'mean' to avoid this message next time.
[1] "Running predictions for TTX vs cLTP"
[1] "# GSEA measures Done!"

preparing geneSet collections...
GSEA analysis...
leading edge analysis...
done...
[1] "# GSEA RUN Done!"
[1] "# Overrepresentation Done!"
[1] "# GSEA report done!"
[1] 1
<simpleError in dim(ordered) <- ns: dims [product 1] do not match the length of object [0]>
Error in dripARF_result_heatmap(dripARF_results = dripARF_results, targetDir = targetDir, : object 'last' not found
In addition: Warning messages:
1: Transformation introduced infinite values in continuous y-axis
2: Removed 18 rows containing non-finite values (stat_ydensity()).
3: Transformation introduced infinite values in continuous y-axis
4: Removed 18 rows containing non-finite values (stat_ydensity()).
5: In DESeqDataSet(se, design = design, ignoreRank) : some variables in design formula are characters, converting to factors
6: In fgseaMultilevel(pathways = pathways, stats = stats, minSize = minSize, : For some pathways, in reality P-values are less than 1e-10. You can set the eps argument to zero for better estimation.
7: Removed 1 rows containing missing values (geom_point()).
8: Removed 1 rows containing missing values (geom_point()).

ferhatalkan commented 1 year ago

Ah sorry, it is indeed mouse. I think I figured out what is going on. Your comparison probably doesn't have any significant results (you can check the GSEA report in the output), which causes the downstream error when plotting the heatmap. If you need a temporary fix, change return(last) to return(NULL) at line 576.

nayanvs commented 1 year ago

I see. Is there any way to add other organisms' rRNAs to the list? I would like to give it a try with rat and marmoset rRNA.

ferhatalkan commented 1 year ago

I'm currently working on including other organisms. It will be available sometime soon.