GiantSpaceRobot / tsRNAsearch

Nextflow DSL2 tRNA and ncRNA fragment identification pipeline for small RNA-seq data
MIT License

Pipeline throws an error when SAMcollapse.py does not return a tRNA group in group comparison analysis #9

Open SBaindoor opened 2 years ago

SBaindoor commented 2 years ago

Hi,

This issue occurs only in group comparison analysis. I have 3 control files and 3 treatment files on which I'm running a group comparison. SAMcollapse.py (called in the BAM_COLLAPSE module) does not return any matches for one of the files (counterGroup in the results for this file is 0). As a result, the values for that file are set to NA in Combined_case_tRNAs-almost-mapped_RPM.depth, and the DATA_TRANSFORMATIONS module throws an error when calculating the mean of the column containing NA. Is it possible to handle the absence of tRNA groups in a file without throwing an error?

Thank you

SBaindoor commented 2 years ago

.command.err content:

Traceback (most recent call last):
  File "/mnt/raid1/tsRNAsearch/bin/MeanCalculator.py", line 24, in <module>
    data = map(float, data)
ValueError: could not convert string to float: NA
Traceback (most recent call last):
  File "/mnt/raid1/tsRNAsearch/bin/MeanCalculator.py", line 24, in <module>
    data = map(float, data)
ValueError: could not convert string to float: NA
Traceback (most recent call last):
  File "/mnt/raid1/tsRNAsearch/bin/Mean-to-RelativeDifference.py", line 23, in <module>
    feature, mean1, mean2, std1, std2 = items[0], float(items[2]), float(items[6]), float(items[3]), float(items[7])
IndexError: list index out of range

Contents of the Combined_T_tRNAs-almost-mapped_RPM.depth file. The last column (NA) comes from the file for which SAMcollapse.py did not return any tRNA groups:

image

GiantSpaceRobot commented 2 years ago

Hi there. I've just made a fix, please download the newest version of the pipeline (or just replace MeanCalculator.py) and try again. NA values are now replaced with zero prior to continuing with the pipeline. Let me know if this fixes the issue, thanks.
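For readers following along, the idea of replacing NA with zero before averaging can be sketched roughly as below. This is only an illustration under assumptions, not the actual change made to MeanCalculator.py: the helper names `to_floats` and `mean_and_std` are hypothetical, as is the choice of treating an empty field as missing.

```python
import statistics


def to_floats(fields, missing=("NA", ""), fill=0.0):
    """Convert string fields to floats, using `fill` for missing markers.

    Hypothetical helper: the real script may handle this differently.
    """
    return [fill if f.strip() in missing else float(f) for f in fields]


def mean_and_std(fields):
    """Return (mean, population std) of a row after NA replacement."""
    values = to_floats(fields)
    mean = statistics.fmean(values)
    # A single value has no spread; avoid calling pstdev on too little data.
    std = statistics.pstdev(values) if len(values) > 1 else 0.0
    return mean, std


if __name__ == "__main__":
    # One replicate had no tRNA group, so its column holds "NA".
    row = ["12.5", "7.5", "NA"]
    print(to_floats(row))
    print(mean_and_std(row))
```

Note that substituting zero lowers the group mean; whether that is the right treatment for a replicate with no matching reads is a judgment call for the pipeline author.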

SBaindoor commented 2 years ago

Hi,

Thank you for replying so soon.

I updated the MeanCalculator.py and got an error in the FISHERS_METHOD part of the pipeline.

The .command.err contents are:

Warning message:
replacing previous import ‘vctrs::data_frame’ by ‘tibble::data_frame’ when loading ‘dplyr’

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

filter, lag

The following objects are masked from ‘package:base’:

intersect, setdiff, setequal, union

Error in if (mean1 >= 1 || mean2 >= 1) { :
  missing value where TRUE/FALSE needed
In addition: Warning message:
In if (mapply.df == "Error") { :
  the condition has length > 1 and only the first element will be used
Execution halted

image image

SBaindoor commented 2 years ago

I noticed that SAMcollapse.py did find a matching MT gene, but it was not returned because the condition on line 61 of the file, if all(tRNAgroup in x for x in database_matches) == True, was failing: it was comparing MT_TK against chrMT.tRNA18-MT_TK and chrMT.tRNA17-MT_TA (the database_matches values, for reference only). So I changed the code to compare only the MT part, changing lines 54 and 56 to tRNAgroup = (v[0].split("\t")[2].split("-")[1].split("_")[0])+"_" With this change, a matched tRNAgroup is returned.
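The behaviour described above can be reproduced with a small sketch. It assumes that the separator between the MT prefix and the gene name is an underscore, and it uses a hypothetical helper `group_in_all` that mirrors the `all(tRNAgroup in x for x in database_matches)` check quoted from SAMcollapse.py; the rest of that script is not reproduced here.

```python
def group_in_all(trna_group, database_matches):
    """True only if the group string occurs in every matched reference name."""
    return all(trna_group in name for name in database_matches)


# Reference names taken from the comment above.
database_matches = ["chrMT.tRNA18-MT_TK", "chrMT.tRNA17-MT_TA"]

# Group derived from the first match: everything after the hyphen.
full_group = database_matches[0].split("-")[1]

# Assumed workaround: keep only the part before the underscore, plus "_".
prefix_group = full_group.split("_")[0] + "_"

if __name__ == "__main__":
    # "MT_TK" is not a substring of "chrMT.tRNA17-MT_TA", so the check fails.
    print(full_group, group_in_all(full_group, database_matches))
    # "MT_" occurs in both names, so the check passes.
    print(prefix_group, group_in_all(prefix_group, database_matches))
```

The shorter prefix makes all mitochondrial tRNAs collapse into one group, which may or may not be the grouping the pipeline intends.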

SBaindoor commented 2 years ago

Hi,

Please let me know if you need more information regarding the issue.

Thank you

GiantSpaceRobot commented 2 years ago

Looks like the same problem as before. I'll have to make a fix that corrects the NA values earlier in the pipeline. Unfortunately I won't be able to fix this in the next few days as I can only update this pipeline in my spare time. I'll get to it as soon as I can though, thanks!

SBaindoor commented 2 years ago

Thank you for replying, no problem at all