matsengrp / antigen

Simulating virus evolution and epidemiology
http://bedford.io/projects/antigen/

Random epitope mutations #24

Closed thienktran closed 1 year ago

thienktran commented 1 year ago

Description

I created an inner class called MutationVector in Biology.java for two reasons:

  1. Vector attributes are stored in named fields, so we can return a MutationVector instead of a double[]. A double[] requires us to remember how the attributes are arbitrarily indexed.
  2. In some cases, we need the direction and magnitude in addition to mutA and mutB, so having fields named theta and r keeps things more organized.

MutationVector contains calculateMutation(), a function that computes how a mutation affects the location of a virus in antigenic space. It is called either once per vector when the predefined vectors are created in Biology.java, or each time mutate() is called.
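To illustrate the idea, here is a minimal sketch of what such a class could look like. It is not the code from this PR. The polar-to-Cartesian relation between (theta, r) and (mutA, mutB), the `fromPolar` helper, the `meanStep` argument, and the uniform/exponential draws are all assumptions made for the example; the simulation's actual distributions are not reproduced here.

```java
import java.util.Random;

/** Hypothetical sketch of a mutation vector with named fields instead of a bare double[]. */
final class MutationVector {
    final double mutA;   // displacement along the first antigenic dimension
    final double mutB;   // displacement along the second antigenic dimension
    final double theta;  // direction of the mutation, in radians
    final double r;      // magnitude of the mutation

    private MutationVector(double theta, double r) {
        this.theta = theta;
        this.r = r;
        // Polar-to-Cartesian conversion (an assumption about how mutA and mutB are derived).
        this.mutA = r * Math.cos(theta);
        this.mutB = r * Math.sin(theta);
    }

    /** Builds a vector from an explicit direction and magnitude (hypothetical helper). */
    static MutationVector fromPolar(double theta, double r) {
        return new MutationVector(theta, r);
    }

    /**
     * Draws a random vector. The uniform direction and exponential magnitude are
     * placeholders chosen to keep the sketch dependency-free.
     */
    static MutationVector calculateMutation(Random rng, double meanStep) {
        double theta = 2.0 * Math.PI * rng.nextDouble();
        double r = -meanStep * Math.log(1.0 - rng.nextDouble()); // exponential draw, always >= 0
        return fromPolar(theta, r);
    }
}
```

A caller can then read `v.theta` or `v.r` by name rather than remembering which index of a double[] holds which attribute.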

predefinedVectors is a new parameter that determines whether the simulation uses random or predefined mutation vectors. A conditional statement in mutate() checks it to decide whether a vector can be retrieved or has to be freshly computed.
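The retrieve-or-compute choice can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: `VectorSource`, `Vec`, `vectorFor`, and the per-site table are hypothetical names, the table is indexed by site only as a simplification, and the draw uses a fixed unit magnitude as a placeholder.

```java
import java.util.Random;

/** Hypothetical sketch: retrieve a predefined vector or compute a fresh one. */
final class VectorSource {
    /** Minimal stand-in for MutationVector (displacement only). */
    static final class Vec {
        final double mutA;
        final double mutB;
        Vec(double mutA, double mutB) { this.mutA = mutA; this.mutB = mutB; }
    }

    private final boolean predefinedVectors; // mirrors the predefinedVectors parameter
    private final Vec[] table;               // one vector per site; empty when vectors are random
    private final Random rng;

    VectorSource(boolean predefinedVectors, int numSites, Random rng) {
        this.predefinedVectors = predefinedVectors;
        this.rng = rng;
        this.table = new Vec[predefinedVectors ? numSites : 0];
        for (int i = 0; i < table.length; i++) {
            table[i] = draw(); // computed once, up front
        }
    }

    /** Placeholder draw: uniform direction, unit magnitude. */
    private Vec draw() {
        double theta = 2.0 * Math.PI * rng.nextDouble();
        return new Vec(Math.cos(theta), Math.sin(theta));
    }

    /** The kind of conditional mutate() would use to obtain a vector. */
    Vec vectorFor(int site) {
        if (predefinedVectors) {
            return table[site]; // retrieved: the same vector every time for this site
        }
        return draw();          // freshly computed on every call
    }
}
```

With `predefinedVectors` set to true, repeated mutations at one site reuse the same vector; with false, each call produces a new one.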

Lastly, I rearranged the order of some logic in mutate(). Synonymous mutations don't require updates to the nucleotide sequence, so that means we can return early and save ourselves some unnecessary computations. The virus doesn't move in antigenic space, so there is no need to calculate or determine the mutation vector.

Tests

In addition to running the tests from #17 with predefinedVectors: true and predefinedVectors: false, I used a PrintStream variable to write a CSV file of the vectors randomly calculated to represent the antigenic effect of mutations in epitope sites for a single simulation. Using 'testGammaDistribution.py', we get:
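For reference, writing vectors to CSV through a PrintStream might look like the sketch below. The `VectorCsvWriter` class, the `mutA,mutB` header, and the column layout are assumptions for illustration; the file produced for this PR may be laid out differently.

```java
import java.io.PrintStream;
import java.util.Locale;

/** Hypothetical sketch: dump mutation vectors as CSV through a PrintStream. */
final class VectorCsvWriter {
    /** Writes a header row, then one "mutA,mutB" row per vector. */
    static void writeCsv(PrintStream out, double[][] vectors) {
        out.print("mutA,mutB\n");
        for (double[] v : vectors) {
            // Locale.ROOT keeps '.' as the decimal separator regardless of system locale.
            out.printf(Locale.ROOT, "%.4f,%.4f\n", v[0], v[1]);
        }
        out.flush();
    }
}
```

To write to disk, the stream can be opened with `new PrintStream("vectors.csv")` (the file name here is made up) and closed once the simulation finishes.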

[Image: Screen Shot 2023-03-28 at 10 16 40 PM]

Here is what happens if vectors in non-epitope sites are included in the CSV:

[Image: Screen Shot 2023-03-28 at 9 57 17 PM]

(The orange line is the distribution used in the original Antigen, shown for reference.)

The parameters used when predefinedVectors: false can be found in this commit's parameters.yml. I can only simulate up to 1000 days before running out of memory, so the trees produced aren't very informative.

[Image: Screen Shot 2023-03-28 at 9 53 29 PM]

Haddox commented 1 year ago

@thienktran: looks good!

One question: above you said "Synonymous mutations don't require updates to the nucleotide sequence". This isn't correct, as synonymous mutations do require updating the nucleotide sequence. They just do not require updating the virus's location in antigenic space. I wanted to make sure that synonymous mutations are still changing the virus's sequence in the context of the updates you made. Thanks!

thienktran commented 1 year ago

That's a really good catch, @Haddox!! I will move the code that updates the nucleotide sequence up, so that both returns are accurate. Thank you for catching that!

Edit: It's been fixed now!
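For readers following along, the corrected ordering discussed here can be sketched as below. This is a simplified illustration, not the code in Biology.java: `MutateOrderSketch`, its fields, and the in-place update are assumptions, and the caller supplies both the synonymous flag and the displacement.

```java
/** Hypothetical sketch of the corrected order of operations in mutate(). */
final class MutateOrderSketch {
    final char[] sequence; // nucleotide sequence (simplified representation)
    double traitA;         // location in antigenic space, first dimension
    double traitB;         // location in antigenic space, second dimension

    MutateOrderSketch(String sequence) {
        this.sequence = sequence.toCharArray();
    }

    /** Applies one nucleotide change at the given site. */
    void mutate(int site, char newNucleotide, boolean isSynonymous, double mutA, double mutB) {
        // 1. Always record the nucleotide change, synonymous or not.
        sequence[site] = newNucleotide;

        // 2. A synonymous mutation leaves the antigenic location unchanged,
        //    so return before doing any vector work.
        if (isSynonymous) {
            return;
        }

        // 3. A nonsynonymous mutation moves the virus from its current position.
        traitA += mutA;
        traitB += mutB;
    }
}
```

The early return still saves the vector computation for synonymous mutations, but only after the sequence itself has been updated.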

Haddox commented 1 year ago

Thanks Thien!

thienktran commented 1 year ago

@Haddox Yes, I was able to fix it! When predefinedVectors: false, I forgot to update the location from the current position. Changing this gave me the same results as what we saw earlier.

Haddox commented 1 year ago

Great! Sounds good.