ihmwg / python-modelcif

Python package for handling ModelCIF mmCIF and BinaryCIF files
MIT License
9 stars 1 forks source link

Duplicated entries in _ma_software_parameter #36

Closed gtauriello closed 1 day ago

gtauriello commented 1 day ago

I am struggling with duplicated rows on ModelCIF...

I have a protocol to dump to ModelCIF where I have multiple variants of a given software package (in the problematic case: the same SW with differing SW parameters). Then for the QE metrics, I needed a SoftwareGroup to group all those variants. To avoid duplications in the dumper I made sure to reuse the exact same SoftwareWithParameters objects when putting together the metadata.

Example script and output file: test_sw_params.zip (to simplify my actual setup I used an additional protocol step with the SoftwareGroup combining the multiple SoftwareWithParameters objects)

My concrete expectation in the resulting test.cif in the example would have been to have

loop_
_ma_software_parameter.parameter_id
_ma_software_parameter.group_id
_ma_software_parameter.data_type
_ma_software_parameter.name
_ma_software_parameter.value
_ma_software_parameter.description
1 1 integer custom_setting 1 .
2 2 integer custom_setting 2 .
#
#
loop_
_ma_software_group.ordinal_id
_ma_software_group.group_id
_ma_software_group.software_id
_ma_software_group.parameter_group_id
1 1 1 1
2 2 1 2
3 3 1 1
4 3 1 2

instead of

loop_
_ma_software_parameter.parameter_id
_ma_software_parameter.group_id
_ma_software_parameter.data_type
_ma_software_parameter.name
_ma_software_parameter.value
_ma_software_parameter.description
1 3 integer custom_setting 1 .
2 4 integer custom_setting 2 .
3 3 integer custom_setting 1 .
4 4 integer custom_setting 2 .
#
#
loop_
_ma_software_group.ordinal_id
_ma_software_group.group_id
_ma_software_group.software_id
_ma_software_group.parameter_group_id
1 1 1 3
2 2 1 4
3 3 1 3
4 3 1 4

Any chance to fix that?

benmwebb commented 1 day ago

Right, the dumper for software parameters is rather simplistic and just enumerates every parameter for every software; I didn't consider the possibility that multiple software might share the same parameters. But this should be easy enough to fix.

gtauriello commented 11 hours ago

Awesome. Thanks for the quick fix.