Unified `transformations` module

ipcamit commented 10 months ago

@mjwen @yonatank93 I would like to have your opinions on this new module I am currently testing for KLIFF. The idea is to unify all transformations under three categories:

Parameter transforms (as already discussed in the previous issue)
Property transforms (take in a dataset or a list of configurations and apply a collective transform to any attribute)
Configuration transforms (take in a Configuration and apply transforms such as graph calculation or descriptors)
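
To make the categories a bit more concrete, here is a rough sketch of how a parameter transform and a configuration transform might look. All class and method names are hypothetical placeholders, not the module's actual API, and the pair-distance "descriptor" is only a toy stand-in for a real graph or descriptor calculation. Property transforms, which act on a whole dataset, are left out here for brevity.

```python
# Hypothetical sketch: class and method names are illustrative, not the KLIFF API.
import math
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Configuration:
    """Stand-in for one atomic configuration (positions only)."""
    coords: List[Tuple[float, float, float]]


class ParameterTransform(ABC):
    """Maps a model parameter to and from the space the optimizer works in."""

    @abstractmethod
    def transform(self, value: float) -> float: ...

    @abstractmethod
    def inverse(self, value: float) -> float: ...


class ConfigurationTransform(ABC):
    """Maps a single configuration to a derived representation."""

    @abstractmethod
    def __call__(self, config: Configuration) -> List[float]: ...


class LogTransform(ParameterTransform):
    """Optimize log(p) instead of p; requires p > 0."""

    def transform(self, value: float) -> float:
        return math.log(value)

    def inverse(self, value: float) -> float:
        return math.exp(value)


class PairDistances(ConfigurationTransform):
    """Toy descriptor: sorted distances between all pairs of atoms."""

    def __call__(self, config: Configuration) -> List[float]:
        c = config.coords
        return sorted(
            math.dist(c[i], c[j])
            for i in range(len(c))
            for j in range(i + 1, len(c))
        )
```

The point of the split is that a parameter transform is invertible and acts on model parameters, while a configuration transform is a one-way map from a single configuration to whatever representation the model consumes.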

I think this organizes things in a more principled fashion. What do you think? The current organization is:

KLIFF
- transforms
  - property_transforms.py
  - parameter_transforms.py
  - configuration_transforms
    - graph_generator
    - descriptors

Major changes to existing code:

The current descriptors will be moved to legacy/descriptors
The Configuration object will have a fingerprint property that can store the descriptor for reuse or transformation. So if you want to normalize the descriptors, you would apply a property transform to the property "fingerprint".
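
As an illustration of the fingerprint idea, the sketch below standardizes a cached descriptor across a dataset by applying a property transform keyed on "fingerprint". The `Configuration` stand-in and the `NormalizeProperty` class are hypothetical names chosen for this example; the real interface may differ.

```python
# Hypothetical sketch: names are illustrative, not the KLIFF API.
from dataclasses import dataclass
from statistics import fmean, pstdev
from typing import List, Optional


@dataclass
class Configuration:
    """Stand-in configuration with a cached descriptor."""
    energy: float
    fingerprint: Optional[List[float]] = None


class NormalizeProperty:
    """Standardize one vector-valued attribute over a whole dataset."""

    def __init__(self, property_key: str):
        self.property_key = property_key
        self.mean: List[float] = []
        self.std: List[float] = []

    def transform(self, dataset: List[Configuration]) -> None:
        # Collect the attribute column-wise to get per-component statistics.
        columns = list(zip(*(getattr(c, self.property_key) for c in dataset)))
        self.mean = [fmean(col) for col in columns]
        # A constant component has zero spread; fall back to 1.0 to avoid dividing by 0.
        self.std = [pstdev(col) or 1.0 for col in columns]
        for c in dataset:
            values = getattr(c, self.property_key)
            scaled = [(v - m) / s for v, m, s in zip(values, self.mean, self.std)]
            setattr(c, self.property_key, scaled)

    def inverse(self, dataset: List[Configuration]) -> None:
        # Undo transform() using the statistics stored during the forward pass.
        for c in dataset:
            values = getattr(c, self.property_key)
            restored = [v * s + m for v, m, s in zip(values, self.mean, self.std)]
            setattr(c, self.property_key, restored)


dataset = [
    Configuration(energy=-1.0, fingerprint=[1.0, 10.0]),
    Configuration(energy=-2.0, fingerprint=[3.0, 10.0]),
]
normalizer = NormalizeProperty("fingerprint")
normalizer.transform(dataset)   # fingerprints are now standardized in place
```

Because the transform is keyed by attribute name, the same class could in principle be pointed at any other vector-valued property without changes.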

Comments and suggestions?

mjwen commented 10 months ago

@ipcamit

I love the proposal. It makes total sense to me. I just have one question:

For Property transforms, we will implement the inverse transform too, right? This would make it possible to fit on transformed values while, at prediction time, the model still gives the correct value after the inverse transform is applied.

ipcamit commented 10 months ago

In the transformation class, yes. There are both transform and inverse methods. But you highlight one issue that I have not been sure about.

In ML models like Nequip, the difference between training-time and evaluation-time behavior is baked into the model. The KLIFF ML trainer will take the same approach.
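
A minimal sketch of what "baked into the model" could mean: the wrapper below returns outputs in the transformed space while training and applies the inverse transform in evaluation mode. This is not how Nequip or KLIFF implements it; `ScaledModel`, `core`, and the `train`/`eval` switches are hypothetical names used only to illustrate the idea.

```python
# Hypothetical sketch: illustrates the idea only, not Nequip's or KLIFF's implementation.
from typing import Callable


class ScaledModel:
    """Wraps a core model that predicts in the transformed (standardized) space."""

    def __init__(self, core: Callable[[float], float], mean: float, std: float):
        self.core = core
        self.mean = mean
        self.std = std
        self.training = True

    def train(self) -> None:
        self.training = True

    def eval(self) -> None:
        self.training = False

    def __call__(self, x: float) -> float:
        y = self.core(x)
        if self.training:
            return y                       # loss is computed against transformed targets
        return y * self.std + self.mean    # inverse transform back to physical units


model = ScaledModel(core=lambda x: 0.5 * x, mean=10.0, std=2.0)
train_output = model(1.0)   # transformed space
model.eval()
eval_output = model(1.0)    # physical units
```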

The only way to work with physics-based models is for our trainer to predict the values using the model, then apply the property transform so that both the predicted property and the actual property are in the same space. This way the model weights are in the actual space, but the optimizer works with transformed property losses. So it does not make much difference for classical models.
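
A rough sketch of that flow, assuming a simple standardization as the property transform and a mean squared error loss. `StandardizeEnergy` and `transformed_loss` are hypothetical names for this example, and the "model" is just a plain callable standing in for a physics-based model.

```python
# Hypothetical sketch: names are illustrative, not the KLIFF trainer API.
from statistics import fmean, pstdev
from typing import Callable, List, Sequence


class StandardizeEnergy:
    """Property transform with statistics taken from the reference values."""

    def __init__(self, reference: Sequence[float]):
        self.mean = fmean(reference)
        self.std = pstdev(reference) or 1.0  # avoid dividing by zero

    def transform(self, values: Sequence[float]) -> List[float]:
        return [(v - self.mean) / self.std for v in values]

    def inverse(self, values: Sequence[float]) -> List[float]:
        return [v * self.std + self.mean for v in values]


def transformed_loss(
    model: Callable[[float], float],
    inputs: Sequence[float],
    targets: Sequence[float],
    prop_transform: StandardizeEnergy,
) -> float:
    """Mean squared error evaluated in the transformed property space."""
    predicted = [model(x) for x in inputs]            # model output stays in physical units
    pred_t = prop_transform.transform(predicted)      # map prediction to transformed space
    target_t = prop_transform.transform(targets)      # map reference to the same space
    return fmean((p - t) ** 2 for p, t in zip(pred_t, target_t))


targets = [1.0, 3.0]
loss = transformed_loss(lambda x: 2.0 * x, [0.5, 2.0], targets, StandardizeEnergy(targets))
```

For an affine transform like this one, the transformed loss is just the physical-space loss divided by the squared standard deviation, so the optimal model parameters are unchanged; only the scale seen by the optimizer differs.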

With this I will start putting the training workflow together. It should only take a couple of days. Perhaps we can schedule another meeting afterwards to discuss what is left?

mjwen commented 10 months ago

Sounds great!

Yes, let's schedule a meeting to discuss it afterwards.

ipcamit commented 10 months ago

Hi @mjwen and @yonatank93, I have just committed the newest trainer module to kliff. It includes a consistent YAML-based trainer for physics-based models and ML models, including a torch lightning trainer. With this, I don't think anything remains other than polish and documentation. Could you both suggest a meeting time to discuss the way forward? I was hoping for a beta release before the end of November. The only significant thing left is the kliff-layers module, which will give kliff the capability to easily generate models based on nequip and mace. For nequip I have working code and will contribute it soon; mace may come in the future. It will also be the home of pretrained layers like M3GNet for transfer learning (and hopefully FERMat when it is ready). In the meantime, please let me know when both of you could be free for a 45 min Zoom meeting (preferably this week, but there is nothing urgent, so please don't feel pressured!)

yonatank93 commented 10 months ago

Thanks @ipcamit. I haven't had a chance to test your updates with the UQ stuff. But if we want to meet this week, I will be available on Tuesday between 9:30 am (after the KIM meeting) and 3 pm, Wednesday before 4 pm, or Thursday before 2 pm. All of these are in Mountain time.

mjwen commented 10 months ago

Let's meet tomorrow (Tuesday) after the KIM meeting?

ipcamit commented 10 months ago

Sure, I have sent a Zoom invitation by e-mail.

ipcamit / kliff

Unified `transformations` module #4