SystemsGenetics / GEMmaker

A workflow for construction of Gene Expression count Matrices (GEMs). Useful for Differential Gene Expression (DGE) analysis and Gene Co-Expression Network (GCN) construction
https://gemmaker.readthedocs.io/en/latest/
MIT License

Edge Case Fastq Merge #259

Open JohnHadish opened 2 years ago

JohnHadish commented 2 years ago

Description of the bug

Edge case, which could arguably be considered incorrect data entry into NCBI. For project PRJNA605000, the researchers used two technologies to produce 3 fastq files for each sample (1 NextSeq and 1 HiSeq with 2 lanes). This causes GEMmaker to fail during the fastq merge, since it does not expect a NextSeq run in addition to the two HiSeq runs.

Directory:

SRR11029259_1.fastq
SRR11029259_2.fastq
SRR11029259_3.fastq
SRR11029260_1.fastq
SRR11029260_2.fastq
SRR11029260_3.fastq
SRR11029261_1.fastq
SRR11029261_2.fastq
SRR11029261_3.fastq
SRR11029262_1.fastq
SRR11029262_2.fastq
SRR11029262_3.fastq
SRR11029263_1.fastq
SRR11029263_2.fastq
SRR11029263_3.fastq
SRR11029265_1.fastq
SRR11029265_2.fastq
SRR11029265_3.fastq
SRX7681625_1.fastq
SRX7681625_2.fastq

It would be relatively easy to detect when this edge case happens, though a check is probably not necessary since the case is very specific.
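
A rough sketch of what such a detection could look like, in Python rather than GEMmaker's own workflow code. It assumes (this is inferred from the failure described above, not confirmed against the GEMmaker source) that the merge step expects at most two files per accession, `_1` and `_2`. The function name `find_unexpected_runs`, the `max_files` parameter, and the filename pattern are hypothetical and only mirror the directory listing above.

```python
import re
from collections import defaultdict

# Hypothetical filename pattern: <accession>_<read index>.fastq,
# matching the names in the directory listing above.
FASTQ_NAME = re.compile(r"^(?P<acc>[A-Z]{3}\d+)_(?P<idx>\d+)\.fastq$")


def find_unexpected_runs(filenames, max_files=2):
    """Return {accession: sorted read indices} for every accession
    that has more than `max_files` fastq files."""
    by_acc = defaultdict(set)
    for name in filenames:
        match = FASTQ_NAME.match(name)
        if match:  # silently skip names that do not fit the pattern
            by_acc[match.group("acc")].add(int(match.group("idx")))
    return {
        acc: sorted(indices)
        for acc, indices in by_acc.items()
        if len(indices) > max_files
    }


# A subset of the listing from this issue.
listing = [
    "SRR11029259_1.fastq",
    "SRR11029259_2.fastq",
    "SRR11029259_3.fastq",
    "SRX7681625_1.fastq",
    "SRX7681625_2.fastq",
]

flagged = find_unexpected_runs(listing)
print(flagged)  # {'SRR11029259': [1, 2, 3]}
```

A check like this could run before the merge and either warn or skip the affected samples, so the workflow fails with a clear message instead of partway through the merge.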

JohnHadish commented 2 years ago

Same issue: PRJNA515437

JohnHadish commented 2 years ago

https://www.ncbi.nlm.nih.gov/bioproject/PRJNA473012