phetsims / build-a-molecule

"Build a Molecule" is an educational simulation in HTML5, by PhET Interactive Simulations.
GNU General Public License v3.0

Possible Misordering of Chemical Formula #206

Closed KatieWoe closed 3 years ago

KatieWoe commented 3 years ago

For https://github.com/phetsims/QA/issues/538 From Slack:

Kathryn Woessner 2:12 PM Should both of these molecules be CHN, even though they are put together differently?

chnforboth

Amy Rouinfar 2:13 PM Hydrogen cyanide is usually HCN.

2:14 The order in the chemical formula doesn’t technically matter, though.

Kathryn Woessner 2:14 PM It's that way in the game too

thatingame

2:14 Is it worth an issue?

Amy Rouinfar 2:14 PM Yes

2:15 I saw something similar in the collections but it wasn’t as glaring…

2:16 Looks like I don’t still have it up, so I’ll have to find it again. But it was something unconventional like O2C for carbon dioxide instead of the usual CO2.

Denzell Barnett 2:17 PM PubChem says the formula is HCN or CHN. What would be the ideal fix?

Amy Rouinfar 2:17 PM HCN would be ideal

2:18 CHN is probably following a common ordering convention for larger organic molecules, but HCN implies the atoms are connected H-C-N which is preferable here.

Denz1994 commented 3 years ago

Also from the same slack thread:

Amy Rouinfar 4:23 PM H-C-N and C-N-H are both legal structures. C-H-N is not a legal structure.

4:23 We should change the goal text for hydrogen cyanide to HCN.

arouinfar commented 3 years ago

@Denz1994 if it is trivial to change the chemical formulas in collectionMoleculesData.js, I'd also like to make these changes:

KatieWoe commented 3 years ago

I'm a bit confused since I think at least some of those are already right? I think I just saw silane come up and it was right.

KatieWoe commented 3 years ago

Ammonia is wrong though. So maybe I saw silane wrong

arouinfar commented 3 years ago

I was assuming that the chemical formula that appears in the collection came directly from collectionMoleculesData.js. I am pretty sure I saw H3N and O2S earlier today. I'll play with the collections a bit more to make sure.

KatieWoe commented 3 years ago

Looks like an odd mixed bag: mixed bag

Denz1994 commented 3 years ago

Because these entries come directly from the PubChem database, we would need to either review the entries by hand or develop a post-processing step that applies a heuristic to reorder the formulas. It seems like PubChem doesn't follow a consistent pattern. It would be reasonable to at least check the collection box goal molecules.

These are found in collectionMoleculesData.js mentioned above.

arouinfar commented 3 years ago

Of the five molecules that have an odd chemical formula ordering in collectionMoleculesData.js, three of them appear to use the preferable chemical formula ordering in the collection:

image image image

These two molecules use the exact chemical formula ordering in collectionMoleculesData.js, and I think the ordering is less than ideal. image image

KatieWoe commented 3 years ago

Other chemicals that are more complicated seem impacted as well: h4n2

KatieWoe commented 3 years ago

Wikipedia says the above should be N2H4

KatieWoe commented 3 years ago

Not sure of this one, so posting just in case: couldnttellpho

arouinfar commented 3 years ago

I don't think there's anything we can or should do about molecular formulas in otherMoleculesData.js. PubChem lists H4N2 as an acceptable formula for Hydrazine. I think PubChem is using a convention where the formula is ordered C > H > N > O.
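
A minimal sketch of how such an ordering yields formulas like H4N2, assuming the convention is "carbon first, then hydrogen, then the remaining elements alphabetically" (which agrees with C > H > N > O for these four elements). The function name `hillStyleFormula` and the element-count input shape are hypothetical; they are not taken from the sim or from PubChem.

```javascript
// Hypothetical helper, not the sim's actual code.
// Assumed rule: carbon first, then hydrogen, then other symbols alphabetically.
function hillStyleFormula( counts ) {
  // Rank C before H before everything else.
  const rank = symbol => ( symbol === 'C' ? 0 : symbol === 'H' ? 1 : 2 );

  const symbols = Object.keys( counts ).sort( ( a, b ) => {
    const byRank = rank( a ) - rank( b );
    if ( byRank !== 0 ) {
      return byRank;
    }
    // Same rank: fall back to plain alphabetical order of the symbols.
    return a < b ? -1 : a > b ? 1 : 0;
  } );

  // Append the count only when it is greater than one.
  return symbols.map( s => s + ( counts[ s ] > 1 ? counts[ s ] : '' ) ).join( '' );
}

console.log( hillStyleFormula( { N: 2, H: 4 } ) );       // hydrazine
console.log( hillStyleFormula( { H: 1, C: 1, N: 1 } ) ); // hydrogen cyanide
console.log( hillStyleFormula( { N: 1, H: 3 } ) );       // ammonia
```

Under this assumed rule, hydrazine, hydrogen cyanide, and ammonia come out as H4N2, CHN, and H3N, the spellings reported in this thread.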

To be clear, the formulas for hydrogen cyanide and ammonia are not wrong when ordered this way, but they just look weird. I think HCN and NH3 are going to be more familiar to students, which is why it'd be nice if the formulas could show up that way in the collections. @Denz1994 is that possible?

Denz1994 commented 3 years ago

@arouinfar We have an internal function for getting the formula of a molecule: MoleculeStructure.getGeneralFormula.

More specifically here are some comments from the code:

   // carbon first, then hydrogen, then others alphabetically, otherwise sort by increasing electronegativity

This relates to the order of elements in the formula. Also here are some exception cases for H3N and CHN below.

'H3N': 'NH3', // treated as if it is organic
'CHN': 'HCN'  // not considered organic
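
A rough sketch of how an exception table like this could be applied as a last step, assuming the generated formula string is used as the lookup key. The names `FORMULA_EXCEPTIONS` and `displayFormula` are hypothetical, and this is not the actual body of MoleculeStructure.getGeneralFormula; only the two entries come from the snippet above.

```javascript
// Hypothetical override table; the two entries are the ones quoted above.
const FORMULA_EXCEPTIONS = {
  H3N: 'NH3',
  CHN: 'HCN'
};

// Hypothetical helper: returns the familiar spelling when one is listed,
// otherwise passes the generated formula through unchanged.
function displayFormula( generatedFormula ) {
  const hasOverride = Object.prototype.hasOwnProperty.call( FORMULA_EXCEPTIONS, generatedFormula );
  return hasOverride ? FORMULA_EXCEPTIONS[ generatedFormula ] : generatedFormula;
}

console.log( displayFormula( 'CHN' ) );  // overridden
console.log( displayFormula( 'H4N2' ) ); // no entry, left as generated
```

Keeping the overrides in one small table means any formula without an entry, such as H4N2, is left exactly as generated.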

I'll continue with the suggested changes you posted above, but I wanted to leave this here in case it gives more context.

Denz1994 commented 3 years ago

Okay, molecules should be fixed in the above commit. I'll verify it with @arouinfar.

image

Denz1994 commented 3 years ago

Here are the other examples requested above:

image

image

image

image

Denz1994 commented 3 years ago

@arouinfar Based on the images above do these formulas look correct? If so, we can close this issue.

arouinfar commented 3 years ago

@Denz1994 the screenshot for hydrogen cyanide shows CHN instead of HCN. The other examples all look good.

Denz1994 commented 3 years ago

Apologies, I've posted an image from a broken version. Here are the proper HCN molecule and goal text.

image

Denz1994 commented 3 years ago

It looks like this RC has the correct ordering. Thanks for the input. Closing this one.