dereneaton / ipyrad

Interactive assembly and analysis of RAD-seq data sets
http://ipyrad.readthedocs.io
GNU General Public License v3.0

Problem with demultiplexing of 3RAD dataset #424

Closed ludmilasromek closed 3 years ago

ludmilasromek commented 3 years ago

Hello all,

I think there is a problem with the demultiplexing step in ipyrad when there are two inline barcodes. I don't know what's going on, but I got very different numbers of reads for some of the barcodes when I compared results from ipyrad and process_radtags from Stacks, e.g.:

ipyrad:

sample_name  total_reads
EY_164       6558

sample_name  true_bar            obs_bar             N_records
EY_164       AGCGTTGAT+TCGGTACC  AGCGTTGAT+TCGGTACC  46
EY_164       AGCGTTGAT+TCGGTACC  AGCGTTGAT+TAGGTACC  804
EY_164       AGCGTTGAT+TCGGTACC  AGCGTTGAT+TCGATACC  697
EY_164       AGCGTTGAT+TCGGTACC  AGCGTTGAT+TCGTTACC  490
EY_164       AGCGTTGAT+TCGGTACC  AGCGTTGAT+GCGGTACC  1
EY_164       AGCGTTGAT+TCGGTACC  AGCGTTGAT+TTGGTACC  497
EY_164       AGCGTTGAT+TCGGTACC  AGCGTTGAT+TCTGTACC  424
EY_164       AGCGTTGAT+TCGGTACC  AGCGTTGAT+TCAGTACC  2535

process_radtags:

Barcode             Filename  Total    NoRadTag  LowQuality  Retained
AGCGTTGAT-TCGGTACC  EY_164    3131244  17825     922         3112497

This was with ipyrad-0.9.55 (datatype: pair3rad, 1 mismatch allowed in the barcode). The Stacks command was:

process_radtags -P -p ./raw_reads_Plate01 -b ./barcodes_plate01_stack.txt -o ./demultiplexed_stack_plate01/ -c -q -r --inline_inline --renz_1 mspI --renz_2 bamHI

isaacovercast commented 3 years ago

I see, so the ipyrad results show the mismatched barcode reads. It looks like it's allowing 2 mismatches. Is that all the reads you see for that sample? Can you email me the full s1 results file? In practice, allowing mismatches recovers very little data, as you can see, so I typically don't recommend it unless you're doing damage control and every little bit counts.
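For anyone following along, one way to check how many mismatches each observed barcode actually carries is to compute the Hamming distance between the `true_bar` and `obs_bar` columns. The sketch below is not ipyrad code; the function names (`hamming`, `mismatch_summary`) are made up for illustration, and the barcodes and counts are copied by hand from the EY_164 rows in the first post. On those rows it reports a distance of 0 or 1 for every observed barcode.

```python
# Minimal sketch, not part of ipyrad: helper names are hypothetical.
# Barcodes and read counts are copied from the EY_164 rows posted above.

TRUE_BARCODE = "AGCGTTGAT+TCGGTACC"

OBSERVED = {
    "AGCGTTGAT+TCGGTACC": 46,
    "AGCGTTGAT+TAGGTACC": 804,
    "AGCGTTGAT+TCGATACC": 697,
    "AGCGTTGAT+TCGTTACC": 490,
    "AGCGTTGAT+GCGGTACC": 1,
    "AGCGTTGAT+TTGGTACC": 497,
    "AGCGTTGAT+TCTGTACC": 424,
    "AGCGTTGAT+TCAGTACC": 2535,
}


def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def mismatch_summary(true_bar: str, observed: dict) -> dict:
    """Sum read counts per Hamming distance from the expected barcode."""
    summary = {}
    for obs_bar, n_records in observed.items():
        dist = hamming(true_bar, obs_bar)
        summary[dist] = summary.get(dist, 0) + n_records
    return summary


if __name__ == "__main__":
    for obs_bar, n_records in OBSERVED.items():
        print(obs_bar, hamming(TRUE_BARCODE, obs_bar), n_records)
    print(mismatch_summary(TRUE_BARCODE, OBSERVED))
```

Running the same summary over the full s1 stats file for every sample would show whether the low exact-match counts are specific to one barcode or spread across the plate.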

ludmilasromek commented 3 years ago

Yes, those are all the reads for this sample. I sent you the full results files by e-mail.

isaacovercast commented 3 years ago

I never received an email. Can you try again?

isaacovercast commented 3 years ago

Ok, well here's a weird clue. In the ipyrad demux process, ALL the samples with barcode 2 sequence TCGGTACC failed in this same way (109, 110, 133, 134, 142, 164, 211, & 258), with very, very few reads recovered. This is highly suspicious, but I haven't figured out the problem yet. Never seen this before...

ludmilasromek commented 3 years ago

I think something is going wrong with the entire demultiplexing step, not only with one barcode. When I looked at the final results I recovered my three species, but there are too many hybrid individuals to be plausible (both when I allow and when I do not allow 1 mismatch in the barcode).

isaacovercast commented 3 years ago

Did you ever make progress on this? Can you try updating to the most recent version of ipyrad and running this again?

isaacovercast commented 3 years ago

Fixed in 3b6dbea