EnzymeML / PyEnzyme

🧬 - Data management and modeling framework based on EnzymeML.
BSD 2-Clause "Simplified" License
23 stars 9 forks source link

pyenzyme does not allow to change the units #29

Closed fbergmann closed 2 years ago

fbergmann commented 2 years ago

I would expect the following code to change the unit of a protein, this is however not the case:

enzmldoc = EnzymeMLDocument.fromFile("EnzymeML_Lagerman.omex")
print(f'original unit {enzmldoc.getProtein("p0").unit}')
enzmldoc.getProtein("p0").unit = 'mmole / l'
print(f'changed unit {enzmldoc.getProtein("p0").unit}')
enzmldoc.toFile(".", name="Test")
enzmldoc = EnzymeMLDocument.fromFile("Test.omex")
print(f'read unit {enzmldoc.getProtein("p0").unit}')

i get the output:

original unit umole / l
changed unit mmole / l

Archive was written to .\Test.omex

read unit umole / l

i would have expected it to be mmole in the final document.

JR-1991 commented 2 years ago

Thanks for submitting the issue! I fixed this by doing a last check up at writing the document if the unit-string given in the unit-attribute is compliant with private attribute _unit_id. If not, it will point or create the appropriate UnitDefinition compliant to the unit-string.

However, it is not a satisfactory solution, because at the moment the creation of the UnitDefinition happens either when added to the EnzymeMLDocument or when finally written to SBML. It would be awesome that this happens whenever the unit-attribute changes, but this requires the EnzymeMLDocument and when setting an attribute a second argument can't be given.

Maybe I am missing something which could enable this, but for now I think the compromise is the best to maintain the general workflow. Happy for ideas though!

fbergmann commented 2 years ago

I think what should happen in any case is that if the unit is changed to something else, the internal _unit_id should be reset to None. That way you can easily fix it when writing out, and you wont encounter an inconsistency. The code in your fix looks fine.

JR-1991 commented 2 years ago

I found a way to fix this by linking the EnzymeMLDocument to any object that possesses a unit. This way, whenever a unit changes, the new unit will be created, assigned to the unit_dict and finally the new ID set to the _unit_id attribute. Hence, the new unit will then be written to the SBML document as expected.

The code you submitted now works as expected. I had to change the target unit as it has been corrected in the new OMEX archive.

enzmldoc = pe.EnzymeMLDocument.fromFile("EnzymeML_Lagerman.omex")
print(f'original unit {enzmldoc.getProtein("p0").unit}')
enzmldoc.getProtein("p0").unit = 'umole / l'
print(f'changed unit {enzmldoc.getProtein("p0").unit}')
enzmldoc.toFile(".", name="Test")
enzmldoc = pe.EnzymeMLDocument.fromFile("Test.omex")
print(f'read unit {enzmldoc.getProtein("p0").unit}')

Out:

original unit mmole / l
changed unit umole / l

Archive was written to ./Test.omex

read unit umole / l