ModelSEED / ModelSEEDDatabase

This repository contains the definitive copy of the biochemistry and metadata used to construct models using the ModelSEED/ProbAnno approach
Other
55 stars 38 forks source link

Invalid value in deltag and deltagerr fields #3

Closed mmundy42 closed 9 years ago

mmundy42 commented 9 years ago

There are 1088 compounds in the Biochemistry/compounds.default.tsv and Biochemistry/compounds.master.tsv files where the value of the the deltag field is "deltaG" and the value of the deltagerr field is "deltaGErr". Should the values be reset to 10000000 to indicate the value is unknown?

mmundy42 commented 9 years ago

In Biochemistry/reactions.master.tsv there are 68 reactions where the value of the deltag field is "deltaG" and 63 reactions where the value of the deltagerr field is "deltaGErr".

In Biochemistry/reactions.default.tsv there are 621 reactions where the value of the deltag field is "deltaG" and 455 reactions where the value of the deltagerr field is "deltaGErr".

samseaver commented 9 years ago

This shouldn't be appearing, I should have checked more closely. The function that updates the values so that they should read "null" is not recognizing specific unusual instances and prints the column name instead.

samseaver commented 9 years ago

Fixed in https://github.com/ModelSEED/ModelSEEDDatabase/commit/cb304d53b03d1c3a8bad86a40f26fd7aa294c544

samseaver commented 9 years ago

I'm re-opening this as a placeholder for dealing with default values in the biochemistry. It occurred to me that right now we have 2, maybe 3 default values for deltag and deltagerr which are 'null', '10000000' and possibly '0' (I'm inclined to believe that '0' is not an actual result)

mmundy42 commented 9 years ago

Confirmed that the original problem is fixed and deltag and deltagerr fields now have "null" instead of the column name.

I'd prefer to use "null" as the default value to be consistent with other columns. In reactions.master.tsv there are 4,405 reactions with a deltag value of 0 and all of them have a deltagerr value of 2. In compounds.master.tsv there are 63 compounds with a deltag value of 0 and 59 compounds with a deltagerr value of 0.

mmundy42 commented 9 years ago

Also, I noticed the is_obsolete and linked_reaction fields were removed from reactions.default.tsv and reactions.plantdefault.tsv. Are those fields optional now?

samseaver commented 9 years ago

If I remember rightly, those fields were specifically created for the master list.

mmundy42 commented 9 years ago

Just a note that a value of "null" sometimes needs to be converted to a value to work with the biochemistry objects. The compounds.master.tsv file has 9,602 compounds with "null" for the charge but the Bio::KBase/ObjectAPI/KBaseBiochem/Compound object requires that charge is a number. And the add_compound() method in Bio::KBase/ObjectAPI/KBaseBiochem/Biochemistry requires an empty array if there are no aliases.

samseaver commented 9 years ago

We went will null instead of the previous default which was 10000000. The idea is to indicate that charge is not available, but in the event of building the biochemistry object, we'd have to pick a default number to use instead. I'm against picking zero because it's a valid charge and one we don't want to use for compounds with an unknown charge, so we could revert to using 10000000. The high number at least makes it easy to pick out why some reactions are unbalanced.

cshenry commented 9 years ago

I think we should alter the object API and specs to allow undefined values I think� Also, I find it dubious for a compound to have an unknown charge. Unknown formula or structure maybe� but unknown charge? If the formula is �R�, I think you can reasonably safely say the charge is 0.

On Jul 10, 2015, at 1:55 PM, samseaver notifications@github.com wrote:

We went will null instead of the previous default which was 10000000. The idea is to indicate that charge is not available, but in the event of building the biochemistry object, we'd have to pick a default number to use instead. I'm against picking zero because it's a valid charge and one we don't want to use for compounds with an unknown charge, so we could revert to using 10000000. The high number at least makes it easy to pick out why some reactions are unbalanced.

� Reply to this email directly or view it on GitHub.

samseaver commented 9 years ago

OK, so you're comfortable with my setting the default value for charge to zero?

chian commented 9 years ago

As long as there is another indicator for unknown, I think this sounds good

Sent from my iPhone

On Jul 10, 2015, at 2:26 PM, samseaver notifications@github.com wrote:

OK, so you're comfortable with my setting the default value for charge to zero?

— Reply to this email directly or view it on GitHub.

mmundy42 commented 9 years ago

@samseaver, anything else to do with this issue? When building objects from the files, if a "null" is found, I don't specify the value and let the object pick the default value.

samseaver commented 9 years ago

Ya, seems redundant at the moment.