hyunhwan-jeong / SalmonTE

SalmonTE is an ultra-Fast and Scalable Quantification Pipeline of Transposable Element (TE) Abundances
GNU General Public License v3.0

SalmonTE test condition problem? #41

Closed savytskanatalia closed 4 years ago

savytskanatalia commented 4 years ago

Good evening! I am running into a problem with the SalmonTE test.

Step 1: Loading required libraries...
Step 2: Loading input data...
Step 3: Running the DE analysis...
Error in $<-.data.frame(*tmp*, "condition", value = integer(0)) :
  replacement has 0 rows, data has 6
Calls: SalmonTE -> $<- -> $<-.data.frame
Execution halted

I used the command:

SalmonTE.py test --inpath salmonte --outpath salmonte_out --tabletype csv --analysis_type DE --conditions=control,condition

And my condition.csv is edited to be:

SampleID,control
sample_04,condition
sample_03,control
sample_06,condition
sample_01,control
sample_05,condition
sample_02,control

The head of my EXPR.csv is:

TE,sample_04,sample_03,sample_06,sample_01,sample_05,sample_02
B1,401146.0,389164.0,400909.0,391824.0,407338.0,395345.0

The MAPPING_INFO.csv:

SampleID,num_mapped,num_processed,percent_mapped
sample_04,5611368,83734667,6.701367785937454
sample_03,5461187,83418478,6.546735364795316
sample_06,5618886,83767436,6.707721124471328
sample_01,5458106,83385507,6.545629086359097
sample_05,5615023,83732805,6.705881882256303
sample_02,5457687,83411724,6.543069413119911

My issue seems to be the same as, or similar to, the one in Issue #39. The SalmonTE version I use is 0.4.

Thank you in advance for your reply!

Best wishes, Natalia.

hyunhwan-jeong commented 4 years ago

@savytskanatalia can you change the first row of condition.csv to SampleID,condition?

Thank you,

Hyun-Hwan Jeong
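The header fix suggested above can be checked before launching the test step. The following is a minimal Python sketch, not part of SalmonTE: the function name `check_condition_table` is hypothetical, and the extra check that sample IDs match the expression table is an assumption about what a consistent input should look like, not something the thread confirms SalmonTE enforces.

```python
import csv
import io


def check_condition_table(condition_text, expr_text):
    """Return a list of problems found; an empty list means the tables look consistent.

    Hypothetical helper for illustration only, not a SalmonTE function.
    """
    problems = []
    cond_rows = [row for row in csv.reader(io.StringIO(condition_text)) if row]
    expr_header = next(csv.reader(io.StringIO(expr_text)))

    # The maintainer's suggestion: the first row must be exactly SampleID,condition.
    header = cond_rows[0]
    if header != ["SampleID", "condition"]:
        problems.append(f"header is {header!r}, expected ['SampleID', 'condition']")

    # Assumption: every sample in the condition table should also be a column
    # of the expression table (first column there is the TE name).
    cond_samples = [row[0] for row in cond_rows[1:]]
    expr_samples = expr_header[1:]
    if sorted(cond_samples) != sorted(expr_samples):
        problems.append("sample IDs differ between condition and expression tables")

    return problems


# Data copied from the thread.
body = (
    "sample_04,condition\nsample_03,control\nsample_06,condition\n"
    "sample_01,control\nsample_05,condition\nsample_02,control\n"
)
condition_original = "SampleID,control\n" + body
condition_fixed = "SampleID,condition\n" + body
expr_head = "TE,sample_04,sample_03,sample_06,sample_01,sample_05,sample_02\n"

print(check_condition_table(condition_original, expr_head))
print(check_condition_table(condition_fixed, expr_head))
```

With the files from this thread, the first call reports the header problem and the second returns an empty list.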

savytskanatalia commented 4 years ago

@hyunhwaj yes, I tried it. When I do this, the run proceeds a bit further, but execution is still halted:

Step 1: Loading required libraries...
Step 2: Loading input data...
Step 3: Running the DE analysis...
converting counts to integer mode
estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
-- note: fitType='parametric', but the dispersion trend was not well captured by the
   function: y = a/x + b, and a local regression fit was automatically substituted.
   specify fitType='local' or 'mean' to avoid this message next time.
Error in lfproc(x, y, weights = weights, cens = cens, base = base, geth = geth, :
  newsplit: out of vertex space
Calls: SalmonTE ... estimateDispersionsFit -> localDispersionFit -> locfit -> lfproc -> .C
In addition: There were 17 warnings (use warnings() to see them)
Execution halted

hyunhwan-jeong commented 4 years ago

@savytskanatalia, it seems that the problem is in DESeq2, but I can't reproduce it. If you don't mind, could you share your data with hyunhwaj_at_bcm.edu so that I can figure out what is going wrong?

Thank you,

Hyun-Hwan Jeong

savytskanatalia commented 4 years ago

@hyunhwaj, it was a mistake on my side: I used a TPM expression file as the test input instead of counts. Re-running the quantification step with counts as exprtype and using that output as input for DESeq2 resolves the issue.

I will mark the issue as resolved, because the initial technical issue was resolved by your kind suggestion, and the second error was due to my mistake.

Thank you for your help again!
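The TPM-versus-counts mix-up described above can often be spotted before running the DE step. This is a rough heuristic sketch in Python, not a SalmonTE feature: it assumes that TPM columns sum to roughly one million per sample, and that raw counts generally do not. The function names, the tolerance, and the two toy tables are all made up for illustration.

```python
import csv
import io


def column_sums(expr_text):
    """Sum each sample column of an expression CSV whose first column is the TE name."""
    reader = csv.reader(io.StringIO(expr_text))
    header = next(reader)
    sums = {name: 0.0 for name in header[1:]}
    for row in reader:
        if not row:
            continue
        for name, value in zip(header[1:], row[1:]):
            sums[name] += float(value)
    return sums


def looks_like_tpm(expr_text, tolerance=0.01):
    """Heuristic: True if every sample column sums to about 1,000,000.

    Assumption: TPM values are scaled to one million per sample. A count
    table could match by coincidence, so treat the result as a hint only.
    """
    sums = column_sums(expr_text)
    return bool(sums) and all(
        abs(total - 1_000_000) <= tolerance * 1_000_000 for total in sums.values()
    )


# Toy tables, invented for illustration (not data from this thread).
tpm_like = "TE,s1,s2\nA,600000.0,250000.0\nB,400000.0,750000.0\n"
count_like = "TE,s1,s2\nA,1200,900\nB,35,48\n"

print(looks_like_tpm(tpm_like))
print(looks_like_tpm(count_like))
```

If the check flags a table as TPM-like, re-running quantification with counts as the expression type, as described above, is the safer input for DESeq2.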