phetsims / natural-selection

"Natural Selection" is an educational simulation in HTML5, by PhET Interactive Simulations
GNU General Public License v3.0
3 stars 7 forks source link

specify the initial population of bunnies #9

Closed pixelzoom closed 4 years ago

pixelzoom commented 4 years ago

Client requirements for Natural Selection PhET-iO include the ability to specify the initial population of bunnies.

The description of the initial population includes:

The description should be specifiable via query parameters, to support development, testing, and general PhET users.

If Studio eventually provides a UI for query parameters (see https://github.com/phetsims/joist/issues/576) then the query parameter API may be sufficient. But it's also possible that a programmatic PhET-iO API (IOtype with setters) may be needed.

The initial population must be restored when Reset All is pressed, without requiring a custom Reset All listener (see https://github.com/phetsims/tandem/issues/123).

pixelzoom commented 4 years ago

Additional details, a crash course in genetics:

pixelzoom commented 4 years ago

Initial thoughts on query parameters API. Developed by design team in 11/5/19 meeting, refined by me, moved here from the Natural Selection PhET-iO design doc. (Note that this doesn't address the "per screen" requirement yet)

3 query parameters:

Examples:

?mutations=F&genotypes=FF,Ff,ff&population=25,50,25

?mutations=F,e&genotypes=FFEe,FfEe,ffee&population=10,10,10

?mutations=F,e,T&genotypes=FFEeTT,FfEeTt&population=15,15

?population=50

pixelzoom commented 4 years ago

Mentioning @samreid @zepumph and @chrisklus so that they are aware of this. I'd like to discuss (1) whether a query-parameter API will be sufficient, and (2) whether initial population can be automatically restored for Reset All.

pixelzoom commented 4 years ago

11/6/19 phet-io dev meeting:

@zepumph suggested having a Property, something like {Property.<PopulationDescription>} initialPopulationDescription. Setting it would clear the PhetioGroup for bunnies and create new bunnies for the initial population. Then this would be handled like any other Property wrt Reset All.

Questions: What is PopulationDescription and how do clients set it via Studio? Is it text like query parameters? @zepumph suggested that we make need to write a custom UI in Studio.

Another alternative that was rejected in 11/5/19 design meeting but @samreid proposed again today was to combine genotype and population into one query parameter. E.g.:

?mutations=F&population=25:FF,50:Ff,25:ff

For usability, it would nice if we could condense to 1 query parameter.

pixelzoom commented 4 years ago

Unassigning until I start this aspect of implementation.

pixelzoom commented 4 years ago

A couple of additional considerations for this feature:

pixelzoom commented 4 years ago

A couple of details that we haven't discussed:

12/11/19 DECISION: Yes, ?population=25.

12/11/19 DECISION: No, query parameter values will not be translatable.

@amanda-phet @kathy-phet your thoughts on the above 2 items?

amanda-phet commented 4 years ago

Is there any need to specify an initial population that contains no mutations? In such a population, the ?mutations query parameter would be irrelevant.

I can't see why someone would need to do that, but @kathy-phet should comment.

The allele abbreviations (F,f,E,e,T,t) are translatable. Should the values for the ?mutations query parameter use the translated abbreviations, or English? I don't know if the former is possible. The latter may make this feature complicated for non-English users.

I have no idea... but it seems like the query parameter should be able to work with the characters that match the locale.

kathy-phet commented 4 years ago

Is there any need to specify an initial population that contains no mutations? In such a population, the ?mutations query parameter would be irrelevant.

I can't see why someone would need to do that, but @kathy-phet should comment.

I can actually see that this would be needed, so that the student still gets to add a mutation and select which is dominant and which is recessive and then run forward from there. I think we should support this mode.

The allele abbreviations (F,f,E,e,T,t) are translatable. Should the values for the ?mutations query parameter use the translated abbreviations, or English? I don't know if the former is possible. The latter may make this feature complicated for non-English users.

I have no idea... but it seems like the query parameter should be able to work with the characters that match the locale.

If this adds significant complexity/cost, I just don't think we can support it at this time. We have no idea what characters these would be. A teacher tips document in another language can specify what is needed for a teacher to use the query parameters.

pixelzoom commented 4 years ago

@kathy-phet said:

I can actually see that this would be needed ...

Can you elaborate? Why is it useful to start with (e.g.) 25 bunnies with no mutations, vs just 2 bunnies with no mutations? And if you want this feature, would it be ?population=25, with no allele abbreviations specified?

If this adds significant complexity/cost, I just don't think we can support it at this time. We have no idea what characters these would be. ...

No idea what the complexity/cost is until I get fairly deep into it. And indeed, we don't know what the chars would be.

pixelzoom commented 4 years ago

After a brief investigation... It is possible to read translated query parameter values. But...

The complications this would introduce are:

(1) Parsing the values to extract allele abbreviations will be trickier -- different char sets, left-to-right, etc.

(2) QA testing will be more involved/expensive -- different char sets, left-to-right, etc.

(3) If the client uses these query parameters in a wrapper, their values will need to match the locale of the sim. It's won't be sufficient to just set ?locale. The wrapper will need knowledge of the allele abbreviation for that locale. And if it hardcodes them, it may become out of sync with the locale's string file.

(4) Anything related to possible query-parameter interface in Studio?

pixelzoom commented 4 years ago

When an initial population is specified, is it considered to be Generation 0?

amanda-phet commented 4 years ago

That's a good question. I was thinking generation 1 at first, since that's the moment when there are enough bunnies to mate, but generation 0 probably makes more sense in this case.

A question that comes to mind- if the client specifies an initial population ≥2, should the "add a mate" button be hidden? It doesn't make much sense to have it if there are already plenty of bunnies that can mate.

pixelzoom commented 4 years ago

That's a good question. I was thinking generation 1 at first, since that's the moment when there are enough bunnies to mate, but generation 0 probably makes more sense in this case.

Generation 0 makes sense to me. And if it's not generation 0, what do we show between generation 0 and 1 for the Population graph graph, and for generation 0 for the Proportions graph?

A question that comes to mind- if the client specifies an initial population ≥2, should the "add a mate" button be hidden? It doesn't make much sense to have it if there are already plenty of bunnies that can mate.

The "Add a Mate" button is necessary only if there is 1 bunny. If a population > 1 is specified, then when the sim starts, the generation clock will start immediately.

kathy-phet commented 4 years ago

Can you elaborate? Why is it useful to start with (e.g.) 25 bunnies with no mutations, vs just 2 bunnies with no mutations? And if you want this feature, would it be ?population=25, with no allele abbreviations specified?

So I envision teachers and clients wanting to set the initial population of bunnies (so say 50) so the sim populates with that amount of bunnies (and yes thinking this would be generation 0), then they start mating. ( If they use this approach, we may want to have a different form of "add a mate" button that would start the generation clock. )

The reasoning is that mutations enter a population that already exists, and that mutation can be either dominant or recessive. This setting allows the client to set that intial population reasonably high without yet introducing the mutation. Then students are responsible for introducing the mutation and seeing how it spreads/grows in the existing population (if it is favorable).

Yes, ?population=25 seems like a reasonable interface. No allele definition is correct.

kathy-phet commented 4 years ago

The "Add a Mate" button is necessary only if there is 1 bunny. If a population > 1 is specified, then when the sim starts, the generation clock will start immediately.

See my comment above. Instead of hiding it maybe change it to "Begin mating" (maybe too explicit) or "Start Generation Clock". So students have a bit of time to process the interface before the bunnies start doing their thing.

kathy-phet commented 4 years ago

After a brief investigation... It is possible to read translated query parameter values. But... The complications this would introduce are:

I vote we just do English on the query parameters. We don't want to set any expectations too for other query parameters. We don't use translations of "screens=". Clients will want to use this across languages, and will not want to have to change the wrapper to do so other than just have the sim in the other language.

pixelzoom commented 4 years ago

We don't use translations of "screens=".

To clarify... We're not talking about translating query parameter keys ("screens"), we're talking about translating query parameter values. And the value for "screens" is an array of numbers.

pixelzoom commented 4 years ago

The "Add a Mate" button is necessary only if there is 1 bunny. If a population > 1 is specified, then when the sim starts, the generation clock will start immediately. See my comment above. Instead of hiding it maybe change it to "Begin mating" (maybe too explicit) or "Start Generation Clock". So students have a bit of time to process the interface before the bunnies start doing their thing.

So if an initial population > 1 is specified, require the user to push a button to start the generation clock? That's doable, but this feature sure is getting complicated/expensive.

As for the button label... "Generation Clock" is never identified as such in the UI -- do we really want to introduce that term? Maybe something less specific, like "Begin Simulation", "Start", or "Play"? (Reminder that we already have a "Play Again" button.)

pixelzoom commented 4 years ago

12/11/19 design meeting @amanda-phet @ariel-phet @kathy-phet @pixelzoom

Move "Play Again" button to location of "Add a Mate" button. "Play Again" goes back to whatever Generation 0 was when the sim started. Sim plays immediately.

"Play" button appears only when the sim is loaded, if population is > 1. It appears where the "Add a Mate" button would have been, and instead of the "Add a Mate" button.

pixelzoom commented 4 years ago

Summary of what needs to be done:

Unassigning until I get to this.

pixelzoom commented 4 years ago

I got partway through implementing this feature (completed the parser for the query parameters) and a couple of potential problems occurred to me:

(1) If a mutation (e.g. brown fur) is present in the initial population, can it really be considered to be a mutation? A mutation is an alteration of something that came before it. In the initial population (generation zero), there is nothing that came before it. If it's OK to still call it a mutation, great! If not, then this entire feature seems problematic.

(2) If bunnies in the initial population have alleles that we consider mutant (e.g. brown fur) should those bunnies be labeled with the "mutant" symbol in the Pedigree graph?

And related to question (2)...

If brown fur is recessive, and the initial population consists of 10 bunnies with genotype 'Ff', they contain the mutant allele, but their phenotype (appearance) is white fur. Since their genotype contains the mutant allele, should they be labeled with the "mutant" symbol in the Pedigree graph? Or should they be labeled with the "mutant" symbol only if the mutation appears in their phenotype? Either way, won't this be confusing?

When running the sim, we would never have a "first mutant" bunny that is "Ff" or "fF". When a mutation is first introduced, both of a bunny's alleles are set to the mutant allele, so that the mutation appears in the phenotype. (Java version did the same.) So a "first mutant" bunny will always be "FF" or "ff", depending on whether brown fur is dominant or recessive, and the phenotype would therefore always be brown fur.

@amanda-phet is going to consult with Megan. See https://github.com/phetsims/natural-selection/issues/79

kathy-phet commented 4 years ago

I would suggest not specifying any specific mutant bunny, if the population is pre-specified. I don't think we can label any particular bunny with that - since there is no specific time when the mutation was applied.

pixelzoom commented 4 years ago

@kathy-phet That does seem like the most reasonable answer to question (2) above. But I think we also need to be sure that it's appropriate to call it a mutation if it's present in the initial population (question 1 above). A mutation is an alteration of something that came before it. In the initial population (generation zero), there is nothing that came before it.

kathy-phet commented 4 years ago

I'm OK with still calling it a mutation - that happened at some time prior to the start of the sim going. It's a powerful feature to be able to set the state of the starting population.

amanda-phet commented 4 years ago

@kathy-phet that's what I said too, but sent Megan an email explaining the predicament to make sure that solution isn't a pedagogical problem.

pixelzoom commented 4 years ago

5/14/2020 design meeting @amanda-phet @kathy-phet @ariel-phet

Consensus was that I should proceed by not labeling bunnies in the initial population as "original mutants" (no mutation icon in the Pedigree graph).

pixelzoom commented 4 years ago

The format of these query parameters is evolving as I'm implementing, so a quick summary of where things are at...

To distinguish which screen the initial population is for, there is a pair of query parameters for each screen:

I will refer to these query parameters as "mutations" and "population".

If mutations is omitted, then population must be a positive integer (defaults to 1), e.g. labPopulation=10.

mutations is now a string, not an array of strings. I found myself accidentally omitting the commas in labMutations=F,e,t, so labMutations=Fet seemed friendlier.

population is still an array of strings. But I found myself accidentally omitting the colon separator in labPopulation=10:FfEEtt,5:ffeeTT, so labPopulation=10FFEEff,5ffeeTT seemed friendlier.

The parser and verifier for the query parameters is done. Still to do: creating the bunnies, reporting query parameter errors via a dialog, and falling back to defaults if query parameters are incorrect.

pixelzoom commented 4 years ago

@amanda-phet @ariel-phet @kathy-phet This feature has been completed and is ready for review. This is relatively complicated code, so it would be nice to have it reviewed while it's fresh in my mind. You can review in master or 1.0.0-dev.40.

See NaturalSelectionQueryParameters for documentation and examples of these query parameters:

// Intro screen
introMutations
introPopulation

// Lab screen
labMutations
labPopulation

If these query parameters have incorrect/incompatible values, then both of the related query parameters are shown in the "Invalid Query Parameters" dialog, and the sim falls back to the default (1 bunny).

For example, with ?labMutations=Fe&labPopulation=5Ff:

screenshot_319

There is additional detail about the error that is printed to the browser console. But the "Invalid Query Parameters" dialog (which is part of common code) only shows the information above. As you may recall from https://github.com/phetsims/joist/issues/593, the decision was to limit the information in this dialog. @amanda-phet summarized nicely in https://github.com/phetsims/joist/issues/593#issuecomment-578917716:

... My understanding is that we want to create a more robust solution later and will put more effort in to making it user friendly at that time. If that is all correct, then this looks like a fine solution for now.

pixelzoom commented 4 years ago

There's a known problem in 1.0.0-dev.40 (fixed in master). After pressing the "Start Over" or "Reset All" button, the initial mutations are not restored, and the first time the bunnies mate there will be an assertion failure. But you can still use this 1.0.0-dev.40 to evaluate the query parameters for setting up the initial state. Or if you prefer, use master with the fix.

amanda-phet commented 4 years ago

The query parameters seem to be working, and I am not having a hard time entering in values (I experimented with different mutations and also adding lots of genotypes), so I'd say great job on those.

(1) In https://github.com/phetsims/natural-selection/blob/master/js/common/NaturalSelectionQueryParameters.js you listed examples, and I was going to correct you on some. You included introMutations=FeT&introPopulation=5FFeETt,5ffeett as valid, but the Intro screen currently only allows for one mutation (fur). I tried it out, and it works, but it seems really strange that we could have bunnies with long teeth in the Intro screen! Should we do anything about this case? If a PhET-iO client modifies the mutations I can see why you need to allow those parameters, but for a regular user, that seems like it should prompt the error dialog. My recommendation is to leave as is, and address in teacher tips (or wherever instructions for query parameters will live).

(2) I know that in https://github.com/phetsims/joist/issues/593 we decided the user must figure out what is wrong with their invalid parameters, and when some are dependent on others they will both show up. However, I am finding it confusing here because only the second parameter has a mistake, and I expected to only see labPopulation appear in the dialog. It is likely too late to change this, but I wanted to share that when actually using it in practice I wish it only showed the incorrect one, not both. However, I tested with a third (correct) parameter and it worked well that it didn't show that one, so I think it's all probably fine. image

pixelzoom commented 4 years ago

Re (1)... It sounds like we need to have a discussion about this one. Visibility of genes (in either screen) can be changed after startup via PhET-iO. And query parameters do not distinguish between a PhET-iO user and a "regular" user. If introMutations were limited to 'F' or 'f', then you would never be able to set an initial population based on anything other than Fur, no matter how you have the sim configured for PhET-iO (where you might even hide Fur!). Similarly, there's no way to prevent a PhET-iO client from setting up an initial population like labMutations=FeT&labPopulation=10FfEEtt, and then hiding Fur.

Re (2)... If either query parameter is incorrect, then they must both be ignored and reverted to their defaults. And you are deducing that '10FFee' is the incorrect part. That's very difficult (and sometimes dangerous) to try to do programmatically. Playing devil's advocate, I could argue that '10FFee' is the correct part, and that the other parts are incorrect. Or if the query parameters were labMutations=Fet&labPopulation=10FFee, how do you know which one is incorrect? My point is that trying to write a program to guess the user's intent is not appropriate here, and it's better to just show both query parameters (since they will both be ignored) and let the user decide where the error is.

pixelzoom commented 4 years ago

I've moved the examples in NaturalSelectionQueryParameters.js to labMutations and labPopulation, where they should make more sense. And related to (1) above, I've added this note to labMutations:

  // NOTE: PhET-iO allows you to show/hide any of the 3 genes in both screens. It is up to the user to specify
  // only the genes that are visible for the screen. For example, the sim will happily accept 'labMutations=FeT',
  // then allow you to hide Fur in the Lab screen.  Or it will accept 'introMutations=T' and assume that PhET-iO
  // will be making Teeth visible in the Intro screen.
amanda-phet commented 4 years ago

Adding that note is a good idea, and seems adequate to me in addressing (1).

Also, I see your point about (2), so I don't think anything needs to change.

ariel-phet commented 4 years ago

I will look at the query parameters in more detail, but as to https://github.com/phetsims/natural-selection/issues/9#issuecomment-631041347

Perhaps we could say in the dialog "one or more of the Query Parameters listed below has invalid values" so it is perhaps more clear that we are just listing all the parameters, and at least one has a mistake? I agree we should not try to suss out user intent.

pixelzoom commented 4 years ago

The dialog is common code, so the proposal to change the wording deserves a common-code issue. And there are questions. See https://github.com/phetsims/joist/issues/635.

kathy-phet commented 4 years ago

I played around with setting the initial population and watching the mating dynamics of the phenotypes. I think the science looks correct - and made me rethink things, but it makes sense. :) I sent if off to Megan to sanity check just the initial population setting and mating evolution of the phenotypes.

amanda-phet commented 4 years ago

@kathy-phet what is the question that you asked Megan? I don't understand.

I did hear back from Megan about the other questions we had, and will post the emails in the design doc.

  1. It would be more accurate to replace the "Add Mutations" panel title with "Traits", but she said "if that goes beyond than just cutting-and-pasting terms into a program, it probably isn’t worth the hassle."

  2. It is just fine that we aren't labeling the mutant bunnies at mutants in the case that generation 0 has a starting population that includes mutants.

@pixelzoom edit: The discussion about these questions and Megan's responses is taking place in https://github.com/phetsims/natural-selection/issues/79.

kathy-phet commented 4 years ago

Just asked her to play around with setting the initial state of the genetics of the bunny populations, and make sure phenotypes were evolving correctly (without any selection pressure). To have her expert eye on that part of the sim.

If you had time, I wanted to have your sanity check on some of the numbers in the natural selection sim - to make sure the data jives with your prediction of what would happen.

If Amanda already asked you about this, just say so.

Right now, we have a version where you can set the initial population of bunnies and their alleles using query parameters - like so: https://phet-dev.colorado.edu/html/natural-selection/1.0.0-dev.44/phet/natural-selection_en_phet.html?introMutations=F&introPopulation=10FF,10ff On the "Intro" screen this will play preset brown fur as dominant (F), and create a population of 10 bunnies with FF alleles and 10 white bunnies with ff alleles.

You can then watch the genetics evolve without selection factors turned on ... it basically seems to to go 75% brown phenotype and 25% which phenotype and doesn't really evolve after that. A part of me expected more and more brown bunnies, but I think I understand that you basically would end up with steady state of something like 25% FF, 50% Ff and 25% ff.

The "Proportions" graph is the best to watch this evolve.

Does this jive with your math of the situation and how it would evolve? Feel free to play around with the basic mating algorithms to make sure its looking good to you! In the initial population you can also specify Ff bunnies - I just did something simple. If you want to make brown recessive mutation, then do this introMutations=f instead of using the capital F.

Thanks for any feedback. No rush, but I hope you find it fun to play.

amanda-phet commented 4 years ago

@kathy-phet it sounds like you've reviewed the query parameters and agree they are working and usable, correct? If so, I'd like to move your questions about the model to another issue and close this one.

pixelzoom commented 4 years ago

Echoing what @amanda-phet said... This issue is about whether the query parameters work as expected, and result in the desired initial population. If there are questions about how the population evolves after that, then please do create a new issue.

pixelzoom commented 4 years ago

In https://github.com/phetsims/natural-selection/issues/9#issuecomment-634957072 on 5/27, @ariel-phet said:

I will look at the query parameters in more detail, but ...

@ariel-phet can you find some time to complete your review in the next day or two?

ariel-phet commented 4 years ago

I read over the issue from the beginning (appreciated the genetics crash course) and then played with the query parameters. Each time they behaved as expected and the simplifications made to the structure of values of the query parameters seem like nice and intuitive changes.

You certainly have to "know what you are doing" and it took me a moment to digest everything before trying the parameters, but I think the structure is very good. I highly recommend having several examples in whatever documentation we share in the teacher tips as those helped me quite a bit. I have no suggestions for modifications.

pixelzoom commented 4 years ago

@kathy-phet your feedback in https://github.com/phetsims/natural-selection/issues/9#issuecomment-635633166 was:

I played around with setting the initial population and watching the mating dynamics of the phenotypes. I think the science looks correct - and made me rethink things, but it makes sense. :) I sent if off to Megan to sanity check just the initial population setting and mating evolution of the phenotypes.

We've moved the issue of "mating dynamics" to #47, because that's a different issue than setting up the initial population. Do you have any feedback on the query parameters for setting up the initial population? (ease of use, effectiveness, correctness,...)

amanda-phet commented 4 years ago

6/11/20 design meeting: looks good!

pixelzoom commented 4 years ago

The above commit is a refinement to the population query parameter. It requires alleles to be paired in the genotype expressions. For example "5FfEe" is valid, but "5FEfe" is not.