Sometimes the alias in the file and the alias in the database do not match, because the GRO format truncates residue names longer than 5 characters.
With the current method, even if the formula is in the database, no lipid is selected when the resname/alias does not match (because it was truncated), so the molecular_type stays unknown. For example, SB3-14 is truncated to SB3-1 in the GRO file 4ZRY.gro. SB3-10 and SB3-12 also exist, but I know it is SB3-14 because the formula matches.
The current method is:
selected_row = lipid_csml_charmm_gui.loc[ (lipid_csml_charmm_gui["Alias"] == res_name_graph) & (lipid_csml_charmm_gui["Formula"] == formula_graph) ]
So my new method will be:
selected_row = lipid_csml_charmm_gui.loc[(lipid_csml_charmm_gui["Alias"].apply(lambda x: res_name_graph in x)) & (lipid_csml_charmm_gui["Formula"] == formula_graph)]
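To illustrate the difference, here is a minimal sketch on a toy table. The DataFrame contents are placeholder values I made up for the example (they are not taken from the real database), and only the column names "Alias" and "Formula" come from the code above. The last variant, using a prefix match instead of a substring match, is only a suggestion of a possibly stricter alternative, not something the current code does.

```python
import pandas as pd

# Hypothetical stand-in for lipid_csml_charmm_gui (placeholder rows).
lipid_csml_charmm_gui = pd.DataFrame({
    "Alias": ["SB3-10", "SB3-12", "SB3-14"],
    "Formula": ["C15H33NO3S", "C17H37NO3S", "C19H41NO3S"],
})

res_name_graph = "SB3-1"      # residue name as truncated in the GRO file
formula_graph = "C19H41NO3S"  # formula computed from the molecular graph

formula_match = lipid_csml_charmm_gui["Formula"] == formula_graph

# Current method: exact alias match, which fails on truncated names.
exact = lipid_csml_charmm_gui.loc[
    (lipid_csml_charmm_gui["Alias"] == res_name_graph) & formula_match
]

# New method: the truncated name only has to be contained in the alias.
substring = lipid_csml_charmm_gui.loc[
    lipid_csml_charmm_gui["Alias"].apply(lambda x: res_name_graph in x)
    & formula_match
]

# Possible stricter alternative (assumption): truncation keeps the start
# of the name, so a prefix match may be enough.
prefix = lipid_csml_charmm_gui.loc[
    lipid_csml_charmm_gui["Alias"].str.startswith(res_name_graph)
    & formula_match
]

print(len(exact))                    # 0
print(substring["Alias"].tolist())   # ['SB3-14']
print(prefix["Alias"].tolist())      # ['SB3-14']
```

The alias test alone matches all three SB3-1x rows here, so it is the formula condition that narrows the selection down to a single lipid.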