MonashBioinformaticsPlatform / RNAsik-pipe

RNAsik - more than just a pipeline
https://monashbioinformaticsplatform.github.io/RNAsik-pipe/
Apache License 2.0

error in the pipeline 1.5.0, duplicated gene names in geneID file, and naming error for fastqc report generation #24

Closed methylnick closed 6 years ago

methylnick commented 6 years ago

Have a look at the RNAsik output:

http://bioinformatics.erc.monash.edu/home/nick-wong/projects/evelyn.tsantikos/RNAsik.bds.20180501_215237_833.report.html

It seems the reverse strand feature count stage errored out because of duplicate gene names in the gene table. I'm using the iGenomes mm10 UCSC gene annotations.

The second is the prefix extraction, which errored out at FastQC report generation. The FastQC reports were generated, but with the original fastq file names rather than the "truncated" file names you get when you set the -extn flag to more than just fastq.gz, in this case _R1_001.fastq.gz. This is a single-end reads experiment.

serine commented 6 years ago

With the first error

Exception: Epb4 gene Id is already in the dictionary, duplicated gene name

I'd grep for Epb4 in your annotation file. It appears to be a duplicated gene name, which isn't uncommon.
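If grep turns up several hits, a small script can list every gene name that occurs in more than one place. The following is only a minimal sketch, not RNAsik's own check: it assumes a tab-separated GTF whose ninth column carries `gene_id` (and optionally `gene_name`) attributes, and it treats a name as duplicated when it is seen with more than one distinct (chromosome, strand, gene_id) combination. The function name `find_duplicate_gene_names` and the file name `genes.gtf` are made up for illustration.

```python
import re
from collections import defaultdict

# Matches GTF attribute pairs such as: gene_id "Epb4";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def find_duplicate_gene_names(gtf_lines):
    """Return {gene_name: sorted [(chrom, strand, gene_id), ...]} for every
    gene name seen with more than one distinct combination.

    Assumption: this notion of "duplicate" is illustrative only; the
    pipeline's own definition may differ.
    """
    seen = defaultdict(set)
    for line in gtf_lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF record
        attrs = dict(ATTR_RE.findall(fields[8]))
        gene_id = attrs.get("gene_id")
        if gene_id is None:
            continue
        # Fall back to gene_id when no gene_name attribute is present.
        gene_name = attrs.get("gene_name", gene_id)
        seen[gene_name].add((fields[0], fields[6], gene_id))
    return {name: sorted(locs) for name, locs in seen.items() if len(locs) > 1}


# Hypothetical usage; "genes.gtf" stands in for your annotation file:
# with open("genes.gtf") as fh:
#     for name, locs in find_duplicate_gene_names(fh).items():
#         print(name, locs)
```

An empty result would suggest the duplication comes from somewhere other than the annotation itself.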

As for the second, the FastQC error: I think this is to do with the option -extn _R1_001.fastq.gz. -extn is meant only for file extensions, and you are giving it too much. It should be -extn fastq.gz, which is the default anyway.

Related to your previous issue: if your fastq files have a different extension, e.g. .fastq, .fq.gz or .txt.gz, then you can use -extn to specify the extension type. I'm not too sure why you had _R1 there.
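To illustrate why the extra suffix can cause a naming mismatch, here is a rough sketch. It is not RNAsik's code, and both helper names (`strip_extn`, `fastqc_report_name`) are hypothetical. It assumes the sample prefix is obtained by cutting the -extn value off the end of the file name, and it uses a simplified version of FastQC's report naming (input name without .gz and .fastq/.fq, plus `_fastqc.html`).

```python
def strip_extn(filename, extn):
    """Cut `extn` off the end of `filename` to get a sample prefix.

    Hypothetical helper: how the pipeline really derives the prefix
    is an assumption here.
    """
    if extn and filename.endswith(extn):
        # Drop the extension, then any dot left dangling before it.
        return filename[: -len(extn)].rstrip(".")
    return filename


def fastqc_report_name(filename):
    """Simplified FastQC report naming: strip .gz, then .fastq or .fq,
    then append _fastqc.html. Other input formats are ignored here."""
    base = filename
    for ext in (".gz", ".fastq", ".fq"):
        if base.endswith(ext):
            base = base[: -len(ext)]
    return base + "_fastqc.html"


fq = "sample_R1_001.fastq.gz"  # made-up Illumina-style file name

# With the default extension the prefix and the report name line up.
print(strip_extn(fq, "fastq.gz") + "_fastqc.html")
print(fastqc_report_name(fq))

# With the longer value the prefix is shorter than what FastQC writes out.
print(strip_extn(fq, "_R1_001.fastq.gz") + "_fastqc.html")
```

Under these assumptions, anything that later looks for a report named after the shortened prefix would not find the file FastQC actually produced.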

Let me know how it goes

Cheers

methylnick commented 6 years ago

Thanks Kirill. As for the duplicated gene, this is the first time I've come across it, and it's my first time using 1.5.0. Versions 1.4.7 and 1.4.8 didn't throw this error for the files I used (iGenomes).

As for the suffix issue, the _R1 is still there, even with single-end reads. It's an Illumina thing. I'll keep playing with it and see.

Thanks for responding!

serine commented 6 years ago

The suffix doesn't matter; just don't include it in -extn, that's all.

Let me know

Cheers

serine commented 6 years ago

@methylnick closing this issue now. I think it's been resolved? Re-open it if you need more clarification.

Cheers