:wave: @hassansaei @berntpopp - the V2 update is greatly appreciated as it allowed me to run this successfully.
I wonder, however, whether the heuristics around here could be explained more clearly?
https://github.com/hassansaei/VNtyper/blob/37f88e94e38b4089d1b5b1a7a50891a10b926245/vntyper/scripts/kestrel_genotyping.py#L491
In particular, the pandas data frame wrangling reads like black magic and could use some polishing. IMO, we are not dealing with large amounts of data, so explicit modeling with Python dataclasses or similar could make things more transparent. If you are interested, I could contribute some patches that make the logic more explicit without data frames. I have been working on reproducing the method without data frames (which is now obsolete, luckily, because of your V2 updates).
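
To illustrate the kind of explicit modeling I have in mind, here is a minimal sketch. Everything in it is hypothetical: the class name `VariantCall`, its fields, the `passes_filter` rule, and the `min_depth_score` threshold are made up for illustration and are not taken from `kestrel_genotyping.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantCall:
    """One candidate variant. All fields are hypothetical, not VNtyper's schema."""

    ref: str
    alt: str
    alt_depth: int
    region_depth: int

    @property
    def is_frameshift(self) -> bool:
        # An indel shifts the reading frame when the length difference
        # between ALT and REF is not a multiple of 3.
        return (len(self.alt) - len(self.ref)) % 3 != 0

    @property
    def depth_score(self) -> float:
        # Fraction of reads in the region that support the ALT allele.
        if self.region_depth == 0:
            return 0.0
        return self.alt_depth / self.region_depth


def passes_filter(call: VariantCall, min_depth_score: float = 0.01) -> bool:
    # Placeholder rule with a made-up threshold; the real heuristics
    # would go here, one named condition at a time.
    return call.is_frameshift and call.depth_score >= min_depth_score


calls = [
    VariantCall(ref="C", alt="CG", alt_depth=50, region_depth=1000),
    VariantCall(ref="C", alt="CGGG", alt_depth=50, region_depth=1000),
    VariantCall(ref="C", alt="CG", alt_depth=1, region_depth=1000),
]
kept = [c for c in calls if passes_filter(c)]
```

With a few dozen records per sample, a plain list of such objects should be fast enough, and each filtering step becomes a named, testable condition instead of a chain of column operations.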