PhasesResearchLab / pySIPFENN

Python python toolset for Structure-Informed Property and Feature Engineering with Neural Networks. It offers unique advantages through (1) effortless extensibility, (2) optimizations for ordered, dilute, and random atomic configurations, and (3) automated model tuning.
https://pysipfenn.org
Other
20 stars 3 forks source link

Notes from first time usage and APIs #17

Open bocklund opened 6 months ago

bocklund commented 6 months ago

calculate_KS2022_randomSolutions always prints all input in parallel mode

https://github.com/PhasesResearchLab/pySIPFENN/blob/b3bec0478d9faf0476c58072b617ee5710621dc9/pysipfenn/core/pysipfenn.py#L612

It may be kind of noisy to always print the input without being behind a verbose flag if the compList is very long (especially since it's a lot of pymatgen Structure objects). Feel free to close

amkrajewski commented 6 months ago

It's a leftover from some debugging. Already fixing it!

bocklund commented 6 months ago

I have some more general notes, and might just leave them here. Feel free to convert any to issues or I can elaborate if needed.

  1. It's surprising that SIPFENN mutates input. For example, calling Calculator.calculate_KS2022_randomSolutions with compList=composition_strs where composition_strs is some list of strings that I created, calculate_KS2022_randomSolutions will mutate the composition_strs list that I passed in and change it's type from List[str] to List[Composition].
  2. Calculator.loadModelCustom seems to require a separately specified model directory and network name here. Maybe it would make more sense to take in a single argument for the path to a network file? If you want to support local directory with paths, I think that should be possible too. To be extra, you could take in a list of paths to search for NNs by name (similar to the PATH variable used by operating systems)
  3. related, it might be a nice convenience if the modelNamein loadModelCustom was simply the networkName by default. At least for some of the methods I've been using, like get_resultDicts() it doesn't seem like the name I provided is used anyway.
  4. It seems inconsistent that Calculator.loadModelCustom hardcodes a onnx extension to the file, but other places like writeDescriptorsToCSV documents that "The file must have a '.csv' extension to be recognized correctly", but this actually isn't a requirement of the code (and isn't enforced anyway) and seems to be more of a suggestion if users want to use the CSV with something that depends on file extensions to recognize the type. I would err on the side of having users include full filenames with extensions, and only using extensions to the extent that they are hints about the type of file being read.
  5. (more of a bug, but I'll leave it here) Calculator.get_resultDictsWithNames fails without a nice warning or error if the inputNames member isn't set on Calculator instances
amkrajewski commented 6 months ago

@bocklund Thank you for taking your time and providing great feedback! As always 😃