pachterlab / seqspec

machine-readable file format for genomic library sequence and structure
MIT License

seqspec check too strict for corner case #51

Open hitz opened 2 hours ago

hitz commented 2 hours ago

@sidwekhande has an example seqspec (e.g. https://api.data.igvf.org/configuration-files/IGVFFI0714JZHN/@@download/IGVFFI0714JZHN.yaml.gz, but auth is required) where the fastqs have already been demuxed and do not contain truseq_read1 or truseq_read2. However, seqspec starts labeling the fastq sequence from the truseq region.

So the truseq regions are submitted with the length set to 0 and the sequence set to null. This triggers:

[error 8] None is not of type 'string' in spec['library_spec'][0]['regions'][0]['sequence']
[error 9] None is not of type 'string' in spec['library_spec'][0]['regions'][2]['sequence']

Proposed fix: seqspec check should account for len=0 and ignore this error.
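
To make the proposal concrete, here is a minimal sketch of the relaxed rule. It is not seqspec's actual implementation: `check_sequences` is a hypothetical helper, and it assumes that a region's length is recorded in `min_len` and `max_len` and that child regions are nested under `regions`. A null `sequence` is reported only when the region is not zero-length.

```python
def check_sequences(regions, path="spec['library_spec']"):
    """Report non-string sequences, skipping regions whose length is 0.

    Hypothetical helper for illustration only. Assumes each region is a dict
    with 'sequence', 'min_len', 'max_len' and an optional nested 'regions' list.
    """
    errors = []
    for i, region in enumerate(regions):
        here = f"{path}[{i}]"
        seq = region.get("sequence")
        # A placeholder region has both length bounds set to 0.
        is_zero_length = region.get("min_len") == 0 and region.get("max_len") == 0
        if not isinstance(seq, str) and not is_zero_length:
            errors.append(f"{seq!r} is not of type 'string' in {here}['sequence']")
        # Recurse into child regions, if any.
        children = region.get("regions") or []
        errors.extend(check_sequences(children, f"{here}['regions']"))
    return errors


# Made-up spec fragment shaped like the case described above: the truseq
# regions are kept as placeholders with length 0 and a null sequence.
library_spec = [
    {
        "region_id": "rna",
        "sequence": "NNNN",
        "min_len": 4,
        "max_len": 4,
        "regions": [
            {"region_id": "truseq_read1", "sequence": None, "min_len": 0, "max_len": 0},
            {"region_id": "cdna", "sequence": "NNNN", "min_len": 4, "max_len": 4},
            {"region_id": "truseq_read2", "sequence": None, "min_len": 0, "max_len": 0},
        ],
    }
]

for message in check_sequences(library_spec):
    print(message)
```

With this rule the zero-length placeholders pass, while a null sequence on a region with a non-zero length would still be flagged.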

sbooeshaghi commented 2 hours ago

Can you post the seqspec file here so that I can test it? I thought I had fixed this previously.

hitz commented 2 hours ago

There are other errors as well: IGVFFI0714JZHN-upgrade3.0.yaml.txt