kevlar-dev / kevlar

Reference-free variant discovery in large eukaryotic genomes
https://kevlar.readthedocs.io
MIT License
40 stars 9 forks source link

Controls for simlike #380

Closed 17tranap closed 4 years ago

17tranap commented 4 years ago

I am trying to run simlike with only one control. I am working with plant genomes, so I only have one parent for the control, and I am getting an index out of bounds error for the abundances. I thought it might be expecting two parents, so I input the same parent counttable twice, but still got the same error.

[kevlar::simlike] Computing likelihood scores for preliminary variant calls
Traceback (most recent call last):
  File "/home/miniconda3/bin/kevlar", line 11, in <module>
    load_entry_point('biokevlar', 'console_scripts', 'kevlar')()
  File "/tools/kevlar/kevlar/__main__.py", line 30, in main
    mainmethod(args)
  File "/tools/kevlar/kevlar/simlike.py", line 383, in main
    for call in calculator:
  File "/tools/kevlar/kevlar/simlike.py", line 332, in simlike
    calc_likescore(call, altabund, refrabund, mu, sigma, epsilon)
  File "/tools/kevlar/kevlar/simlike.py", line 203, in calc_likescore
    error=epsilon)
  File "/tools/kevlar/kevlar/simlike.py", line 137, in likelihood_denovo
    assert len(abunds[2]) == len(refrabunds), (len(abunds[2]), len(refrabunds))
IndexError: list index out of range
standage commented 4 years ago

Unfortunately, the instance of the model implemented by kevlar simlike is pretty specific to autosomal inheritance in diploid organisms. The model itself is generalizable, but I wouldn't expect this last step in the kevlar workflow to work well on plant data out-of-the-box. I'd be happy to advise on implementation and tuning, but wouldn't have much time myself to devote to it at the moment.

17tranap commented 4 years ago

The plant I am working with is a diploid plant whose seeds were mutagenized, grown, and then mated back with the parent. I would be interested in the results of simlike, even if it is not tuned well for my use case, so I am wondering at the source of the error that I attached. Please let me know if you have any ideas and I will try to resolve the code myself. I'd be happy to contribute as a branch in the future.