brentp / somalier

fast sample-swap and relatedness checks on BAMs/CRAMs/VCFs/GVCFs... "like damn that is one smart wine guy"
MIT License

Re-use of inferred parent names in large families #119

Open Awtum opened 1 year ago

Awtum commented 1 year ago

In large families (in this case, merged from many smaller families during --infer), the same inferred parents appear to be re-used across separate sets of siblings. This gives the appearance of many children sharing one pair of parents, when in reality there are multiple sets of 2 or 3 siblings each that may be only distantly related.
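
One way to spot this pattern is to count children per inferred parent pair and flag pairs with more children than expected. The following is only a minimal sketch, not part of somalier: it assumes the inferred pedigree is available as tab-delimited PED-like rows (family_id, sample_id, paternal_id, maternal_id, ...) with "0" or "-9" marking an unknown parent. The function names, the `max_children` threshold, and the example rows are all hypothetical.

```python
from collections import defaultdict

# Assumed conventions for "parent unknown" in a PED-like file.
MISSING = {"0", "-9", ""}


def children_by_parent_pair(ped_lines):
    """Group sample IDs by their (paternal_id, maternal_id) pair.

    Assumes tab-delimited PED-like rows whose first four columns are
    family_id, sample_id, paternal_id, maternal_id.
    """
    groups = defaultdict(list)
    for line in ped_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # not enough columns to read parent IDs
        _family, sample, father, mother = fields[:4]
        if father in MISSING and mother in MISSING:
            continue  # founder: no parents assigned
        groups[(father, mother)].append(sample)
    return dict(groups)


def flag_large_sibships(groups, max_children=3):
    """Return parent pairs that have more than max_children children."""
    return {pair: kids for pair, kids in groups.items() if len(kids) > max_children}


# Hypothetical rows for illustration only (not real data).
example = [
    "#family_id\tsample_id\tpaternal_id\tmaternal_id\tsex\tphenotype",
    "F1\tdad\t0\t0\t1\t-9",
    "F1\tmom\t0\t0\t2\t-9",
    "F1\tkid1\tdad\tmom\t1\t-9",
    "F1\tkid2\tdad\tmom\t2\t-9",
    "F1\tkid3\tdad\tmom\t1\t-9",
    "F1\tkid4\tdad\tmom\t2\t-9",
    "F1\tkid5\tdad\tmom\t1\t-9",
    "F2\tkidA\tdad2\tmom2\t1\t-9",
    "F2\tkidB\tdad2\tmom2\t2\t-9",
]

suspicious = flag_large_sibships(children_by_parent_pair(example))
for (father, mother), kids in suspicious.items():
    print(father, mother, len(kids), ",".join(kids))
```

A flagged pair is not necessarily wrong, since genuinely large sibships exist, so the output is best treated as a list of families to review by hand.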

brentp commented 1 year ago

Yes, --infer uses simple heuristics to do the best it can, but with low-quality or low-coverage samples it can make many mistakes.