WIP: Adding weight information to KLIFF

yonatank93 commented 2 years ago

Currently, I just added an option to pass in c1 and c2 via weight_params argument in Dataset and Configuration. The weights are calculated by $1/w = c1^2 + c2^2 * \Vert f \Vert^2$. The property config.weight is a dictionary containing the computed weights in this way. I modified the default residual function to use these weights.

mjwen commented 2 years ago

Yes. Ecactly

From: Yonatan Kurniawan @.> Sent: Friday, April 22, 2022 6:49:38 PM To: openkim/kliff @.> Cc: Mingjian Wen @.>; Mention @.> Subject: Re: [openkim/kliff] WIP: Adding weight information to KLIFF (PR #48)

@yonatank93 commented on this pull request.

In kliff/loss.pyhttps://github.com/openkim/kliff/pull/48#discussion_r856754602:

         "normalize_by_natoms": True,

}

     residual_data = _check_residual_data(residual_data, default_residual_data)

_check_compute_flag(calculator, residual_data)
_check_compute_flag(calculator, residual_data)

So would something like the following inside _check_compute_flag do?

ew = weight.energy_weight
if calculator.use_energy and ew < 1e-12:
     logger.warning(msg.format("energy", ew))

— Reply to this email directly, view it on GitHubhttps://github.com/openkim/kliff/pull/48#discussion_r856754602, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ACN3C6MD4QX7OSCW2BLXUKDVGNJLFANCNFSM5T5IDCAA. You are receiving this because you were mentioned.Message ID: @.***>

mjwen commented 2 years ago

@yonatank93 let me know when it's ready for me to review

yonatank93 commented 2 years ago

@mjwen I am still working on updating the documentation for the loss function. I am also thinking of adding documentation about the weight in the dataset section. I think you can review the other changes, although I still have the question about https://github.com/openkim/kliff/pull/48#discussion_r856893604.

mjwen commented 2 years ago

@mjwen I am still working on updating the documentation for the loss function.

No rushes at all!

I am also thinking of adding documentation about the weight in the dataset section.

Great! It would be good to refer the doc about weight added in dataset in the doc for loss function, since it is the place where the weight is actually used.

yonatank93 commented 2 years ago

@mjwen Please review the documentation I created/changed in KLIFF.

yonatank93 commented 2 years ago

@mjwen Thanks for catching some of the typos. Is an example for the weight necessary, on top of the documentation? If no, then I don't have anything to add/change.

yonatank93 commented 2 years ago

@mjwen I brought up this weight implementation in KLIFF diring KIM meeting today. There is a significant input, what if we retrieve the weight as a dictionary? That is, maybe we will have weight.get_weight() that returns

weight = {"energy_weight": [...], "forces_weight": [...], "stress_weight": [...]}

The thought is that it will be easier to extend the weight class if we have other material properties, aside from energy, forces, and stress.

mjwen commented 2 years ago

@mjwen Thanks for catching some of the typos. Is an example for the weight necessary, on top of the documentation? If no, then I don't have anything to add/change.

The current examples already used the Weight class, although not explained in great detail. I believe this is good enough given that we have nice documentation for Weight class.

mjwen commented 2 years ago

@mjwen I brought up this weight implementation in KLIFF diring KIM meeting today. There is a significant input, what if we retrieve the weight as a dictionary? That is, maybe we will have weight.get_weight() that returns
weight = {"energy_weight": [...], "forces_weight": [...], "stress_weight": [...]}
The thought is that it will be easier to extend the weight class if we have other material properties, aside from energy, forces, and stress.

I think our current approach should work better, because

the need to fit to other properties is really small; not so many people these days are fitting to other properties
all the KLIFF framework is coded up using energy/forces/stress as reference data. If a user wants the fit to new properties, all other parts needs to be modified. Then, it's fine to modify Weight as well.
our weight.energy_weight class properties approach is cleaner than returning a gigantic dict. I personally prefer simplicity and easier documentation than extensibility. If a user really wants to fit to other properties, the Weight class can be subclassed and new @property can be defined.

yonatank93 commented 2 years ago

Ok. I will keep this in mind as an alternative way to if we need update the weight in the future.

There is another issue I want to bring up. For this update, I deleted the built-in residual functions, and only keep energy_forces_residual. My thought was that the forces_residual and energy_residual only set the respective weight to zero and use energy_forces_residual. So, the same action could also be done by setting the weight in Weight directly, and thus there is no need in defining these 2 residual functions. However, suppose that when creating calculator, I specify use_forces=False, hoping to speed up the calculation. But it seems that the built-in residual function doesn't know if we want to use certain property or not, because it always assumes that the first element of the prediction array is always energy and the rest to be forces. So, should we either:

Update the built-in residual function? Maybe append the use_forces, use_energy, etc from the calculator to residual_data? or
Create other built-in residual functions?

mjwen commented 2 years ago

However, suppose that when creating calculator, I specify use_forces=False, hoping to speed up the calculation. But it seems that the built-in residual function doesn't know if we want to use certain property or not, because it always assumes that the first element of the prediction array is always energy and the rest to be forces. So, should we either:

Update the built-in residual function? Maybe append the use_forces, use_energy, etc from the calculator to residual_data? or

Create other built-in residual functions?

Good catch! The previous energy_residual and force_residual was not correct. Depending on the use_energy and use_forces flag, the prediction may be different. In retrospect, this design is bad. We will need to modify this in the next version.

But for now, I think let's create separate residual functions for different cases (your second option), just to ensure the the interface of a residual function is the same. Then this part also needs to be be updated: depending on the use_energy... flags of the calculator, we need to choose a different residual function.

yonatank93 commented 2 years ago

I think in KLIFF, we can specify use_energy... for each configuration separately. If we update this part with if else statement, it seems that we assume that use_energy... has the same value for all configurations. So, are we going to ignore this for now?

mjwen commented 2 years ago

Technically, use_energy can be specified for each config separately. But we do not expose any user interface for doing that. More specifically, a single use_energy is provided by a user via a calculator, and this single value will be applied to all the configurations. So, it's safe to do what's mentioned above.

If a user does not want to use energy for a configuration, this has to be achieved via the energy_weight through the new Weight class.

mjwen commented 2 years ago

All looks good; merged. Thanks!

openkim / kliff

WIP: Adding weight information to KLIFF #48

_check_compute_flag(calculator, residual_data)