arendsee / rnaseq-pipeline

An RNA-seq pipeline using Kallisto and bioconductor::SRAdb

Check if the reads are paired end while performing counts #1

Open aseetharam opened 7 years ago

aseetharam commented 7 years ago

I encountered a few cases where some of the runs within a study had mixed read types (most runs were paired-end, but a couple were single-end). The script didn't check whether there were 2 files and just calculated the abundance as usual! Can you add a check to see whether the reads are paired-end while mapping as well?

arendsee commented 7 years ago

If you tell me the sample ids for the cases that don't work (and one or two for cases that do work) I can add handling for this.

I may set up a little test suite.

aseetharam commented 7 years ago

Here are the IDs that fail:

SRR1574689
SRR1575179
SRR2890187
SRR700533

Here are the IDs that work:

SRR978408
SRR978411
SRR978413
SRR978416

arendsee commented 7 years ago

Thanks, I'll work on a solution.

aseetharam commented 7 years ago

One option is to use an array instead of passing ${tmpdir}/${runid}_* directly as the input to kallisto, e.g.:

reads=("${tmpdir}/${runid}"_*)

Now you can check how many elements are in the array with ${#reads[@]} before you run kallisto:

# ${#reads[@]} is the element count; ${#reads} alone is the length of the first element
if [ "${#reads[@]}" -eq 2 ]; then
    echo "run kallisto"
fi

This might work!
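
The file-count check could also go in a small helper so the pipeline can branch on the result. This is only a sketch, not the pipeline's actual code: the function name `read_layout` and its output labels are hypothetical, and it assumes a single-end run leaves exactly one file matching `<tmpdir>/<runid>_*` and a paired-end run leaves exactly two. It uses the function's positional parameters instead of an array, so it also runs in plain sh, and it handles the case where the glob matches nothing.

```shell
# Hypothetical helper: prints "paired", "single", "missing", or "unexpected"
# depending on how many files match <tmpdir>/<runid>_*.
# Usage: read_layout <tmpdir> <runid>
read_layout() {
    # Replace the function's positional parameters with the glob matches.
    set -- "$1/$2"_*
    # An unmatched glob is left as the literal pattern, so check it exists.
    if [ ! -e "$1" ]; then
        echo "missing"
        return
    fi
    case $# in
        1) echo "single" ;;
        2) echo "paired" ;;
        *) echo "unexpected" ;;
    esac
}
```

A caller could then run kallisto quant with both files for "paired", switch to its --single mode (which also needs a fragment length and standard deviation via -l and -s) for "single", and skip or report the run otherwise.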