Closed by peterjc 1 year ago
Note this example sequences one marker at a time.
Cross-reference #425: we have no good way to specify expected species lists per sample AND per marker.
Given the current code, the example would need to call assess once per marker (with different marker files set up each time), and probably assess the pooled results separately as well.
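To make the per-marker idea concrete, here is a minimal sketch in plain Python. It is not the THAPBI PICT assess code or its API: the helper names `assess` and `assess_per_marker`, the dictionary layout, and the placeholder species labels are all assumptions for illustration. It scores each marker against its own expected list, then scores the pooled union once more.

```python
def assess(expected, predicted):
    """Return (TP, FP, FN) species counts for one expected/predicted pair."""
    expected, predicted = set(expected), set(predicted)
    return (
        len(expected & predicted),  # true positives
        len(predicted - expected),  # false positives
        len(expected - predicted),  # false negatives
    )


def assess_per_marker(expected_by_marker, predicted_by_marker):
    """Hypothetical helper: assess each marker separately, then pooled."""
    results = {
        marker: assess(expected_by_marker.get(marker, ()), predicted)
        for marker, predicted in predicted_by_marker.items()
    }
    # Pooled view: union of species over all markers.
    pooled_expected = set().union(*expected_by_marker.values())
    pooled_predicted = set().union(*predicted_by_marker.values())
    results["pooled"] = assess(pooled_expected, pooled_predicted)
    return results


# Placeholder data only, not results from the worked example.
expected = {
    "NF1-18Sr2b": {"sp. A", "sp. B"},
    "D3Af-D3Br": {"sp. A", "sp. C"},
}
predicted = {
    "NF1-18Sr2b": {"sp. A", "sp. D"},
    "D3Af-D3Br": {"sp. C"},
}
results = assess_per_marker(expected, predicted)
for name, counts in results.items():
    print(name, counts)
```

With separate expected lists, a species that a marker cannot amplify would no longer count as a false negative for that marker, while the pooled row still shows whether any marker recovered it.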
Why are we lacking any reference sequences for Laimaphelenchus penardi? The authors report recovering it for NF1-18Sr2b and D3Af-D3Br with an NCBI RefSeq reference sequence available.
Update: We can use EU306346.1 and AY593918.1 (Laimaphelenchus penardi) for NF1-18Sr2b, but they get no matches.
The authors say in their S3 table that they used Laimaphelenchus KX580741.1, KX580740.1, and KF881746.1. These have the D3Af left primer, but not the D3Br right primer.
Update: Lowering the threshold, we get an ASV with 227 copies matching KF998578.1 Laimaphelenchus deconincki in D3Af-D3Br, and an ASV with 123 copies matching EU306346.1 Laimaphelenchus penardi in NF1-18Sr2b.
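For checks like this, a rough sketch of an in-silico primer test is below. It is an illustration only: the function names are hypothetical, the primer and reference strings are short made-up placeholders (not the real D3Af or D3Br sequences), and it does exact IUPAC-aware matching with no mismatches allowed.

```python
# IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement, preserving IUPAC ambiguity codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_primer(primer, seq):
    """Return the first index where primer matches seq, or -1 if absent."""
    primer, seq = primer.upper(), seq.upper()
    for start in range(len(seq) - len(primer) + 1):
        window = seq[start:start + len(primer)]
        if all(base in IUPAC[code] for code, base in zip(primer, window)):
            return start
    return -1


def primer_status(left, right, seq):
    """Return (has_left, has_right); right is searched as reverse complement."""
    return (
        find_primer(left, seq) >= 0,
        find_primer(reverse_complement(right), seq) >= 0,
    )


# Made-up placeholder sequences, NOT the real primers or references.
left_primer, right_primer = "ACGRT", "TTGCA"
full_reference = "GGACGATCCCCTGCAAGG"
truncated_reference = "GGACGATCCCC"
print(primer_status(left_primer, right_primer, full_reference))
print(primer_status(left_primer, right_primer, truncated_reference))
```

A reference reporting the left primer but not the right one, as with the three accessions above, would be too short at the 3' end to serve as a full-length amplicon reference for that marker.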
Originally added in #347, before multi-marker support existed, this worked example lacks a discussion of the classifier assessment.
https://thapbi-pict.readthedocs.io/en/latest/examples/soil_nematodes/index.html
Note that we currently assess the controls against the same list of 23 species for all markers.
It appears that the JB3-JB5GED marker is too narrow to cover all 23 species, and NF1-18Sr2b in particular has a false positive problem (it apparently struggles at species level within Globodera, Steinernema, and Xiphinema).