microbiomedata / nmdc-edge

Web-based interface to the NMDC EDGE platform
https://nmdc-edge.org

RQCfilter chastityfilter failed on data from SRA. #180

Closed chienchi closed 3 weeks ago

chienchi commented 4 weeks ago

Workflow Name

ReadsQC

Project URL

https://nmdc-edge.org/admin/project?code=cNDAehUP8J3HTHx0

Additional Info

The ReadsQC step in the metagenomes workflow keeps failing. Log:

Input is being processed as unpaired
Started output streams: 0.006 seconds.
java.lang.Exception: Strangely formatted read. Please disable chastityfilter with the flag chastityfilter=f. id:SRR10907120.10456145.2 10456145 length=151
    at shared.KillSwitch.kill(KillSwitch.java:96)
    at stream.Read.failsChastity(Read.java:1958)
    at stream.Read.failsChastity(Read.java:1932)
    at jgi.BBDuk$ProcessThread.run(BBDuk.java:2512)

If reads do not have the correct Illumina FASTQ header format, the chastityfilter should be turned off. A correctly formatted header looks like this:

@HISEQ09:205:C6PFFANXX:2:1101:4014:1996  1:N:0:GGCTAC
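
To make the header distinction concrete, here is a minimal Python sketch that checks whether a FASTQ header line follows the Illumina (Casava 1.8+) layout shown above, where the second field after the whitespace (`Y` or `N`) is the filter flag a chastity filter needs. The regular expression and the function name `has_illumina_header` are illustrative assumptions, not part of BBTools or the NMDC workflow.

```python
import re

# Assumed layout (Casava 1.8+ style), matching the example header above:
#   @instrument:run:flowcell:lane:tile:x:y read:filtered:control:index
ILLUMINA_HEADER = re.compile(
    r"^@[^:\s]+:\d+:[^:\s]+:\d+:\d+:\d+:\d+"  # instrument:run:flowcell:lane:tile:x:y
    r"\s+[12]:(?P<filtered>[YN]):\d+:\S*$"     # read:filtered:control:index
)


def has_illumina_header(header_line: str) -> bool:
    """Return True if the header carries the Y/N filter field (hypothetical helper)."""
    return ILLUMINA_HEADER.match(header_line.strip()) is not None


if __name__ == "__main__":
    # Header from the issue: has the "1:N:0:..." comment field.
    print(has_illumina_header("@HISEQ09:205:C6PFFANXX:2:1101:4014:1996  1:N:0:GGCTAC"))  # True
    # SRA-style header from the log: no filter field to inspect.
    print(has_illumina_header("@SRR10907120.10456145.2 10456145 length=151"))  # False
```

The SRA read ID from the log has no `read:filtered:control:index` field, which is consistent with the "Strangely formatted read" error.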

mflynn-lanl commented 3 weeks ago

I will add the chastityfilter_flag boolean as an input to rqcfilter.wdl and to the template for the ReadsQC input.json.

mflynn-lanl commented 3 weeks ago

I'm going to set it to false by default. @chienchi, @yxu-lanl can we add something to detect whether the sample is from the SRA, so we can set the boolean to true if it is not?
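
One possible way to do the detection asked about above, sketched in Python: read the first header of the FASTQ file and check whether the read ID starts with an SRA-style run accession (`SRR`, `ERR`, or `DRR`, followed by a spot number), as in the `SRR10907120.10456145.2` ID from the log. This is a heuristic under stated assumptions; the helper names (`looks_like_sra`, `first_header`, `suggest_chastityfilter_flag`) are hypothetical and are not part of the NMDC EDGE code base, and SRA downloads that keep their original read names would not be caught.

```python
import gzip
import re

# Assumption: SRA-normalized read IDs look like "<run accession>.<spot number>",
# e.g. "SRR10907120.10456145.2". The leading "@" is optional so that IDs copied
# from logs (without "@") are also recognized.
SRA_READ_ID = re.compile(r"^@?(?:SRR|ERR|DRR)\d+\.\d+")


def looks_like_sra(header_line: str) -> bool:
    """Heuristic: True if the read ID starts with an SRA run accession."""
    return SRA_READ_ID.match(header_line) is not None


def first_header(fastq_path: str) -> str:
    """Return the first line of a plain or gzip-compressed FASTQ file."""
    opener = gzip.open if fastq_path.endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        return handle.readline().rstrip("\n")


def suggest_chastityfilter_flag(fastq_path: str) -> bool:
    """Suggest a value for the boolean input: False for SRA-style reads, else True."""
    return not looks_like_sra(first_header(fastq_path))
```

Checking only the first record keeps the check cheap. A stricter variant could sample several records, or require the Illumina header layout before enabling the filter rather than only ruling out SRA-style IDs.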