exercism / problem-specifications

Shared metadata for exercism exercises.
MIT License

protein-translation: Confusion between proteins and amino acids #1209

Open jerith opened 6 years ago

jerith commented 6 years ago

The Protein Translation exercise incorrectly uses the word "protein" in place of "amino acid" in the codon mapping table in the description, and seems to use "protein" somewhat vaguely in the test data to refer to either a protein (represented by a list of amino acids) or an amino acid.

As a result, the Elixir track version of this exercise uses "protein" instead of "amino acid" in a bunch of places, leading to confusion and unnecessary difficulty naming things in solution code.

Please see exercism/elixir#396 for my proposed solution there. I think the table heading in the description needs to be corrected here. It would probably be helpful to be more explicit in the canonical test data, although I'm not really sure what the exact wording should be or what issues may result from changing various fields in the JSON.

rpottsoh commented 6 years ago

description.md canonical-data.json

rpottsoh commented 6 years ago

So the exercise is really an amino acid identification exercise and not a protein translation exercise? In the test data, several of the test cases only deal with single amino acids, so "property": "proteins" would likely need to become "property": "aminoacids". Should the name of the exercise be reevaluated?

I am not trying to say that what you are claiming is wrong. I am just brainstorming (out loud) some potential ramifications. Others will likely weigh in on this issue; let's see what they think.

jerith commented 6 years ago

I think "protein translation" is the correct name, because we're translating an RNA sequence into a protein. However, we're doing this by splitting the strand into codons, looking up the amino acid each codon represents, and then assembling the amino acids into a protein.
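To make the protein / amino acid distinction concrete, here is a minimal Python sketch of those three steps. It is not taken from any track's solution: the function names (`codons`, `amino_acid`, `translate`) are hypothetical, and the table is only a small illustrative subset of the standard genetic code, with stop codons marked by the placeholder string "STOP".

```python
# Illustrative subset of the standard genetic code: codon -> amino acid.
# "STOP" is a placeholder used here for stop codons (an assumption of this sketch).
CODON_TO_AMINO_ACID = {
    "AUG": "Methionine",
    "UUU": "Phenylalanine",
    "UUC": "Phenylalanine",
    "UGG": "Tryptophan",
    "UAA": "STOP",
    "UAG": "STOP",
    "UGA": "STOP",
}


def codons(strand):
    """Split an RNA strand into consecutive three-letter codons."""
    return [strand[i:i + 3] for i in range(0, len(strand), 3)]


def amino_acid(codon):
    """Look up the single amino acid that one codon represents."""
    return CODON_TO_AMINO_ACID[codon]


def translate(strand):
    """Assemble a protein (a list of amino acids) from an RNA strand."""
    protein = []
    for codon in codons(strand):
        acid = amino_acid(codon)
        if acid == "STOP":
            break  # a stop codon ends translation
        protein.append(acid)
    return protein
```

With this naming, a single codon yields an amino acid, and only the result of `translate` is called a protein.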

The exercise is very clear about the difference between an RNA sequence and a codon, but less clear about differentiating between a protein and an amino acid.

rpottsoh commented 6 years ago

@jerith I have been comparing the test data to the Elixir test suite, and the test suite appears to differ quite a bit from the test data. Maybe I am not reading the test suite correctly, but it appears that more than one property is being tested: of_rna and of_codon. The test data only tests one property, proteins. Do you think renaming the property proteins to aminoacids would clear up this issue as far as the test data is concerned?

rbasso commented 6 years ago

Exercise naming

I'm not a biologist or anything like that, but I think that, ideally, the "genetics exercises" should be named:

Property naming

Considering that in both problems the only tested properties of the solutions are the outputs of individual functions (or methods), it makes sense to name the properties after those functions (or methods), as we do in most, if not all, exercises.

In rna-transcription, when mapping from DNA to RNA, I suggest transcribe instead of toRna. In protein-translation, when mapping from RNA to Aminoacids / Protein, I suggest translate instead of proteins.

In both cases, domain-specific verbs better communicate the meaning of the operation.
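As a rough illustration of naming a property after the function it tests, the Python sketch below pairs a `transcribe` function with a test case whose property carries the same name. The test-case dictionary, the `FUNCTIONS` lookup, and `run_case` are hypothetical and only loosely shaped like a canonical-data entry; they are not copied from the actual JSON.

```python
def transcribe(dna):
    """Map a DNA strand to its RNA complement (G->C, C->G, T->A, A->U)."""
    return dna.translate(str.maketrans("GCTA", "CGAU"))


# Hypothetical test case: the property name matches the function under test.
case = {"property": "transcribe", "input": "GCTA", "expected": "CGAU"}

# Hypothetical lookup from property name to the function it exercises.
FUNCTIONS = {"transcribe": transcribe}


def run_case(case):
    """Call the function named by the case's property and compare the result."""
    function = FUNCTIONS[case["property"]]
    return function(case["input"]) == case["expected"]
```

A `translate` property for protein-translation would slot into the same lookup in the same way.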

Fast fix

Edit:

Considering that changing exercises' names may be a lot of work, the simplest fix to avoid confusion would be to just drop the s from proteins in this exercise, or rename the property to translate, without further changes. That would solve the semantic problem with minimal changes.