darkreactions / ESCALATE_report

Transform experimental data into ML-ready datasets!
http://escalation.sd2e.org/
MIT License

Generalize structure of the final dataframe for bromides #14

Closed (ipendlet closed 4 years ago)

ipendlet commented 5 years ago

In def reag_info(reagentdf, chemdf) of parser.py and def nameCleaner(sub_dirty_df) of jsontocsv.py there are two references to hardcoded InChI keys for the solvent and for the identity of the halide salt. These keys are used to calculate molarity and to give the reagents an "identity" in a chemically meaningful way. The concentration calculation needs to be generalized to the "fill" solvent, which will likely require the user to enter whatever is considered the solvent either first or last in the experimental dataframe. Most likely it will be the first entry, tracked through the process by being identified as 'reagent 1' in the final JSON.
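A minimal sketch of what a position-based solvent lookup could look like, to make the proposal concrete. This is not the code in parser.py: the function name component_molarities, the dictionary fields (inchikey, moles, volume_ml), and the placeholder keys are all hypothetical, and the sketch assumes the solution volume equals the fill solvent volume (solute volume is ignored).

```python
from typing import Dict, List


def component_molarities(components: List[dict], solvent_index: int = 0) -> Dict[str, float]:
    """Return mol/L for every non-solvent component, keyed by InChIKey.

    The solvent is found by its position in the list (first by default,
    -1 for last) rather than by a hardcoded InChIKey.
    Hypothetical field names: 'inchikey', 'moles', 'volume_ml'.
    """
    if not components:
        raise ValueError("reagent has no components")

    # Normalize so that negative indices (e.g. -1 for "last") also work.
    solvent_index = solvent_index % len(components)

    # Assumption: total solution volume is approximated by the solvent volume.
    volume_l = components[solvent_index]["volume_ml"] / 1000.0
    if volume_l <= 0:
        raise ValueError("solvent volume must be positive")

    return {
        c["inchikey"]: c["moles"] / volume_l
        for i, c in enumerate(components)
        if i != solvent_index
    }


# Placeholder identifiers, not real InChIKeys.
reagent = [
    {"inchikey": "SOLVENT-PLACEHOLDER", "moles": 0.0, "volume_ml": 500.0},
    {"inchikey": "SALT-PLACEHOLDER", "moles": 0.25, "volume_ml": 0.0},
]
print(component_molarities(reagent))
```

With this shape, switching the convention from "solvent first" to "solvent last" is a matter of passing solvent_index=-1, and no solvent identity has to be baked into the parser.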

Furthermore, the final dataframe does not clearly differentiate the chemical identity of the bromides from that of the iodides; it assumes that the user (during data processing) can tell them apart.
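One possible way to make the halide explicit is an extra column filled from a user-supplied lookup instead of hardcoded keys. The sketch below is only an illustration under assumptions: tag_halide, the salt_inchikey column name, the halide column name, and the placeholder keys are all hypothetical and do not come from the existing code.

```python
from typing import Dict, List


def tag_halide(rows: List[dict],
               halide_by_inchikey: Dict[str, str],
               key_column: str = "salt_inchikey") -> List[dict]:
    """Return copies of the rows with an added 'halide' field.

    halide_by_inchikey is supplied by the user (e.g. from the chemical
    inventory), so no InChIKey is hardcoded here. Rows whose key is
    missing from the lookup are tagged 'unknown' rather than guessed.
    """
    tagged = []
    for row in rows:
        new_row = dict(row)  # leave the input rows untouched
        new_row["halide"] = halide_by_inchikey.get(row.get(key_column), "unknown")
        tagged.append(new_row)
    return tagged


# Placeholder identifiers, not real InChIKeys.
lookup = {"BROMIDE-SALT-PLACEHOLDER": "bromide", "IODIDE-SALT-PLACEHOLDER": "iodide"}
rows = [
    {"run": 1, "salt_inchikey": "BROMIDE-SALT-PLACEHOLDER"},
    {"run": 2, "salt_inchikey": "IODIDE-SALT-PLACEHOLDER"},
    {"run": 3, "salt_inchikey": "SOMETHING-ELSE"},
]
for row in tag_halide(rows, lookup):
    print(row["run"], row["halide"])
```

Tagging unmatched rows as 'unknown' keeps the ambiguity visible in the final dataframe instead of leaving it to whoever processes the data later.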

ipendlet commented 5 years ago

The same applies to a function in grabinchi.py.