LadnerLab / PepSIRF

PepSIRF: Peptide-based Serological Immune Response Framework
GNU General Public License v3.0

Automatic library truncation when running demux #221

Closed jtladner closed 3 months ago

jtladner commented 9 months ago

Through the "-l [ --library ]" argument, the user provides a set of expected sequences to which the individual reads are compared.

Through the "--seq" argument, the user specifies the location and length of the individual reads that should be compared to the "library" sequences.

However, there is currently no check that the --library and --seq sequences are the same length. If the lengths differ, no matches can be found.

What I propose is that:

  1. We implement a check that will compare the lengths of the sequences provided/specified with these two arguments.
  2. If the --seq length is less than the --library length, we truncate the --library sequences to match the --seq length. This is expected to be commonly applicable because the sequencing reads don't always cover 100% of the DNA tag (--library).
  3. If the --seq length is greater than the --library length, we terminate processing and issue a warning to the user. This is not an expected scenario and likely reflects user error.
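
To make the proposal concrete, here is a rough sketch of the three steps in Python. It is not PepSIRF code: the function name `reconcile_library_length`, the dict-based library representation, and the use of an exception for the termination case are all hypothetical. It also assumes that truncation keeps the first `seq_length` bases of each library sequence, which the proposal above does not specify.

```python
def reconcile_library_length(library, seq_length):
    """Check library sequences against the --seq length (hypothetical helper).

    library    -- dict mapping sequence name to DNA tag sequence
    seq_length -- number of read bases compared to the library
    Returns a new dict with every sequence truncated to seq_length.
    """
    truncated = {}
    for name, sequence in library.items():
        # Step 1: compare the lengths given by the two arguments.
        if seq_length > len(sequence):
            # Step 3: reads longer than the library tags cannot match,
            # so stop and tell the user.
            raise ValueError(
                f"--seq length ({seq_length}) exceeds the length of library "
                f"sequence '{name}' ({len(sequence)})"
            )
        # Step 2: truncate the library sequence to the read length.
        # Assumption: keep the leading bases; equal lengths are unchanged.
        truncated[name] = sequence[:seq_length]
    return truncated


if __name__ == "__main__":
    library = {"tag_1": "ACGTACGT", "tag_2": "TTTTCCCC"}
    print(reconcile_library_length(library, 5))
```

Returning a new dict leaves the user's original library untouched, so the tool could still report the full-length sequences in its output if that is preferred.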