Closed Didymos056 closed 2 years ago
Sorry I don't understand the issue. The console output seems to correspond to your print statement in the code. What did you expect to happen vs what actually did happen?
Sorry, my English is poor. What I mean is that this mutation is printed because the following check is triggered when the grammar is calculated, and that should not happen, right?
if (seq[aa_pos] != aa_orig):
    print(mutation)
I think the main cause is that the position counter is not advanced when the sequence contains "-", which produces a wrong pos.
for ch1, ch2 in zip(alignment[0], alignment[1]):
    if ch1 != ch2 and ch1 != '-' and ch2 != '-':
        mutations.append('{}{}{}'.format(ch1, pos + 1, ch2))
    elif ch1 == '-' and ch2 != '-':
        mutations.append('{}ins{}'.format(pos + 1, ch2))
    elif ch1 != '-' and ch2 == '-':
        mutations.append('{}{}del'.format(ch1, pos + 1))
    if ch1 != '-':
        pos += 1
return mutations
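To make the position question concrete, here is a minimal sketch on toy data. It assumes the loop above sits inside a function (hypothetically named `find_mutations` here) and that `pos` starts at 0. Neither is shown in the snippet, and the sequences are invented. Under those assumptions, the reported positions index the reference with gaps removed, not the gapped alignment row.

```python
def find_mutations(ref_aligned, alt_aligned):
    """Same loop as above, wrapped in a hypothetical function."""
    mutations = []
    pos = 0  # assumed start value; counts residues of the ungapped reference
    for ch1, ch2 in zip(ref_aligned, alt_aligned):
        if ch1 != ch2 and ch1 != '-' and ch2 != '-':
            mutations.append('{}{}{}'.format(ch1, pos + 1, ch2))
        elif ch1 == '-' and ch2 != '-':
            mutations.append('{}ins{}'.format(pos + 1, ch2))
        elif ch1 != '-' and ch2 == '-':
            mutations.append('{}{}del'.format(ch1, pos + 1))
        if ch1 != '-':
            pos += 1  # gap columns in the reference do not advance pos
    return mutations


ref_aligned = 'MK-TAY'  # toy reference row with one gap column
alt_aligned = 'MKQTVY'  # toy variant row
mutations = find_mutations(ref_aligned, alt_aligned)
print(mutations)  # ['3insQ', 'A4V']

ref_ungapped = ref_aligned.replace('-', '')
idx = 4 - 1  # 0-based index taken from 'A4V'
print(ref_ungapped[idx])  # 'A': matches the original residue in 'A4V'
print(ref_aligned[idx])   # 'T': the gapped row is shifted by the gap column
```

If the sequence indexed later with `seq[aa_pos]` still contains "-" characters, or is a different sequence from the ungapped reference, a mismatch like the one in the check above would be expected.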
Sorry, it's still unclear to me. This seems to be an issue with the input data? In that case, this may be the wrong place to ask this question.
Bug description I often see something like "N665K, F104X" in the console output when running the code. While tracking this down, I found that when the mutation sites are counted, positions corresponding to "-" are skipped. But when the grammatical change is calculated, seq[pos] is used directly in the check. Is this the cause of my problem? Should the pos used for mut_probs also skip "-"?
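One way to test this hypothesis is to compare the residue named in each substitution against the sequence with gap characters removed. The sketch below is only an illustration: `check_substitution` and the toy sequence are hypothetical, and it assumes mutation strings of the form `N665K` with 1-based positions that count ungapped residues.

```python
import re

# Matches substitutions such as 'N665K': original residue, position, new residue.
SUBSTITUTION = re.compile(r'^([A-Z])(\d+)([A-Z])$')


def check_substitution(mutation, seq):
    """Return True if the ungapped `seq` has the original residue at the position."""
    match = SUBSTITUTION.match(mutation)
    if match is None:
        raise ValueError('not a substitution: {!r}'.format(mutation))
    aa_orig, pos, _aa_new = match.groups()
    aa_pos = int(pos) - 1             # convert to a 0-based index
    ungapped = seq.replace('-', '')   # assumption: positions count ungapped residues
    return aa_pos < len(ungapped) and ungapped[aa_pos] == aa_orig


toy_seq = 'MK-TAY'  # invented sequence that still contains a gap character
matches_ungapped = check_substitution('A4V', toy_seq)
matches_direct = toy_seq[4 - 1] == 'A'  # direct indexing into the gapped string
print(matches_ungapped, matches_direct)  # True False
```

If the first value is True while direct indexing fails, the printed mutations would more likely come from a position offset than from the input data itself.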
Reproduction steps
Logs Please paste the command line output: `*****.py:80: RuntimeWarning: invalid value encountered in log10
return np.mean(np.log10(mut_probs))
R455L
R455L
I104T
A225V
T98I
N683K`
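On the warning itself: NumPy reports "invalid value encountered in log10" when `np.log10` receives a negative number (the result is `nan`), while a zero input gives `-inf` with a separate divide-by-zero warning. What `mut_probs` actually contains is not shown in this report, so the following is only a standalone diagnostic sketch with a hypothetical helper name and toy values. It uses plain Python so it runs without NumPy.

```python
import math


def summarize_probs(mut_probs):
    """Count entries by how a base-10 logarithm would treat them."""
    summary = {'negative': 0, 'zero': 0, 'nan': 0, 'positive': 0}
    for p in mut_probs:
        if math.isnan(p):
            summary['nan'] += 1        # log of nan stays nan
        elif p < 0:
            summary['negative'] += 1   # log is undefined here
        elif p == 0:
            summary['zero'] += 1       # log tends to minus infinity
        else:
            summary['positive'] += 1   # log is well defined
    return summary


toy_probs = [0.2, 0.0, -1.0, float('nan'), 0.5]  # invented values
summary = summarize_probs(toy_probs)
print(summary)  # {'negative': 1, 'zero': 1, 'nan': 1, 'positive': 2}
```

A nonzero `negative` count for the real `mut_probs` would account for the warning. Whether that is connected to the position offset cannot be determined from the log alone.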