stfc / janus

collection of scripts to train and generate data for machine learnt interatomic potentials
BSD 3-Clause "New" or "Revised" License

Fix duplicate funcs #13

Closed ElliottKasoar closed 1 year ago

ElliottKasoar commented 2 years ago

Resolves #5 by adding a new tag to each set of symmetry functions. When a set is written, the tag is used to check whether that set is already present in the file, and the existing section is deleted if so.

This also has the benefit of making it clearer which sets would be overwritten and which sets would be appended if parameters are changed.
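
As a rough illustration of the tag-based approach described above, here is a minimal sketch in Python. It is not the code in this PR: the function name `write_set` and the `# BEGIN SET` / `# END SET` marker format are hypothetical, chosen only to show the "check for the tag, delete the old section, write the new one" flow.

```python
def write_set(lines, tag, new_block):
    """Return a copy of `lines` with any section tagged `tag` replaced by `new_block`.

    The marker format below is an assumption made for this sketch, not the
    tag format actually written by the code in this PR.
    """
    start_marker = f"# BEGIN SET {tag}"  # hypothetical opening tag
    end_marker = f"# END SET {tag}"      # hypothetical closing tag

    kept = []
    skipping = False
    for line in lines:
        stripped = line.strip()
        if stripped == start_marker:
            # The same set is already present: start dropping its lines.
            skipping = True
            continue
        if skipping:
            if stripped == end_marker:
                skipping = False
            continue
        kept.append(line)

    # Append the freshly generated set, wrapped in its tag.
    kept.append(start_marker)
    kept.extend(new_block)
    kept.append(end_marker)
    return kept
```

In this sketch a set with a matching tag is overwritten (and moves to the end of the file), while a set with a different tag is left alone and the new one is appended after it. An opening marker with no matching closing marker would cause everything after it to be dropped, so real code would need to guard against that.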

dave452 commented 2 years ago

Will this only work with input.nn files generated using this version of the code?

ElliottKasoar commented 2 years ago

> Will this only work with input.nn files generated using this version of the code?

Currently, yes. If we want to make it work more generally, there are a couple of options:

  1. Implement it in the alternative way I suggested in #5, where we skip the identifier and infer the sets straight from the file, so new files look the same as old ones. This is not an insignificant change, but I don't think it would be too bad either.
  2. Separate adding the identifier from deleting duplicate sets. This has some of the benefits of both methods: both forms of file would work, and sets would be more clearly labelled. It probably requires more or less the same code as (1), though, since we couldn't (always) write the identifier at the same time as we generate the functions, so we'd need to be able to infer the sets in the same way.

It's also worth noting that the current implementation, and the most straightforward alternatives, make assumptions about the format of input.nn. The current implementation assumes an identifier for each set and also relies on the distribution of #s. Alternative (1) would likely also assume that #s separate sets, and would additionally depend on the format of the variables in the set comments.
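
To make the assumptions behind alternative (1) concrete, here is a minimal Python sketch of inferring sets straight from the file. It assumes, purely for illustration, that sets are separated by lines consisting only of `#` characters and that the first comment line of each set describes its variables. The helper names `split_sets` and `infer_key` are hypothetical and do not exist in this repository.

```python
def split_sets(lines):
    """Split `lines` into blocks, using lines made up only of '#' as separators.

    Assumption for this sketch: a separator is any non-empty line whose
    stripped content contains nothing but '#' characters.
    """
    blocks = []
    current = []
    for line in lines:
        stripped = line.strip()
        if stripped and set(stripped) == {"#"}:
            # Separator line: close the current block, if it has any content.
            if current:
                blocks.append(current)
            current = []
        else:
            current.append(line)
    if current:
        blocks.append(current)
    return blocks


def infer_key(block):
    """Return the first comment line of a block, without its leading '#'.

    Assumption for this sketch: that comment lists the set's variables, so
    two blocks with the same key are treated as the same set.
    Returns None if the block contains no comment line.
    """
    for line in block:
        if line.lstrip().startswith("#"):
            return line.lstrip("# ").strip()
    return None
```

With these two helpers, a duplicate check would compare the inferred key of the set about to be written against the keys of the blocks already in the file. It would only be as reliable as the comment format it depends on, which is the limitation noted above.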