iqbal-lab-org / pandora

Pan-genome inference and genotyping with long noisy or short accurate reads
MIT License

Index should check that there are no duplicate read names #132

Open mbhall88 opened 5 years ago

mbhall88 commented 5 years ago

My feeling is that if we see two sequences with the same name, we should raise an error. A lot of other bioinformatics tools also behave this way. I think it also prevents bigger unexplained issues (like this one) from arising further down the analysis road.

Originally posted by @mbhall88 in https://github.com/rmcolq/pandora/issues/126#issuecomment-492161063
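
For illustration, the proposed check could look something like the sketch below. It is not pandora's implementation (pandora is written in C++). It assumes FASTA-like input where a record's name is the first whitespace-delimited token of its `>` header line. The names `DuplicateNameError`, `fasta_names` and `check_unique_names` are hypothetical.

```python
from collections import Counter


class DuplicateNameError(ValueError):
    """Raised when the input contains the same record name more than once."""


def fasta_names(lines):
    """Yield the record name (first token after '>') for each header line."""
    for line in lines:
        if line.startswith(">"):
            header = line[1:].strip()
            # An empty header yields an empty name instead of an IndexError.
            yield header.split()[0] if header else ""


def check_unique_names(lines):
    """Return the number of records, or raise if any name is repeated."""
    counts = Counter(fasta_names(lines))
    duplicates = sorted(name for name, n in counts.items() if n > 1)
    if duplicates:
        # Fail early, listing every offending name, rather than letting
        # the duplicates cause unexplained behaviour downstream.
        raise DuplicateNameError(
            "duplicate sequence names: " + ", ".join(duplicates)
        )
    return len(counts)
```

Failing at indexing time, with all duplicated names in the message, means the user can fix the input once instead of chasing symptoms later in the analysis.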

mbhall88 commented 5 years ago

@leoisl given you are working on index at the moment maybe it makes sense for you to add this in?

leoisl commented 5 years ago

Yeah, I will take care of it after #23

leoisl commented 5 years ago

Update: I am not taking care of this right after #23

This has low priority for now. I am focusing on finishing learning pandora index, map and compare, to make the most of the fact that Rachel is still here (without her, I am sure it would take me much longer to understand index).