materialsproject / emmet

Be a master builder of databases of material properties. Avoid the Kragle.
https://materialsproject.github.io/emmet/

Bad Co data #120

Open shyuep opened 3 years ago

shyuep commented 3 years ago

As mentioned in the email thread, the recent database updates have caused SEVERE issues with Co materials. For example, layered LiCoO2 is now reported as 200 meV/atom above the hull. @computron and I have debugged this, and it is clear that this is because:

  1. The new static runs are being done with Co high spin.
  2. These new runs are blessed despite having much higher energies (0.5 eV/atom) than previous tasks.

You can prove this is the case by searching for LiCoO2 in both prev.materialsproject.org and www.materialsproject.org.
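
For a programmatic version of the same check, here is a minimal sketch using pymatgen's (legacy) MPRester; the placeholder API key is an assumption, and the results reflect whichever database release the endpoint is currently serving:

```python
from pymatgen.ext.matproj import MPRester

# Placeholder API key; the query hits whichever release the endpoint serves
with MPRester("YOUR_API_KEY") as mpr:
    docs = mpr.query(
        criteria={"pretty_formula": "LiCoO2"},
        properties=["material_id", "e_above_hull"],
    )
    for doc in docs:
        print(doc["material_id"], doc["e_above_hull"], "eV/atom")
```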

I recommend the following corrective steps:

  1. Immediate - rebuild and release a database that only includes Co static runs that are not more than 50 meV/atom higher in energy than the lowest energy among all previous relaxation and static tasks. This is of immediate priority. Right now, MP is basically reporting garbage for all Co compounds.
  2. Immediate - write a validator that forbids new calculations with the same functional from being more than 50 meV/atom higher in energy than the lowest-energy structure, in all cases (see the sketch after this list).
  3. Within 1-2 months: Redo all Co static calculations in low spin.
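
A minimal sketch of such a validator, assuming a simplified task record; the field names and the `validate_new_task` helper are illustrative, not emmet's actual schema or API:

```python
from dataclasses import dataclass

@dataclass
class Task:
    task_id: str
    functional: str          # e.g. "GGA+U"
    energy_per_atom: float   # eV/atom

# 50 meV/atom threshold proposed above
MAX_OFFSET_EV_PER_ATOM = 0.050

def validate_new_task(candidate: Task, existing_tasks: list[Task]) -> bool:
    """Reject a new calculation that is more than 50 meV/atom above the
    lowest energy seen so far for the same functional."""
    same_functional = [
        t for t in existing_tasks if t.functional == candidate.functional
    ]
    if not same_functional:
        return True  # nothing to compare against yet
    e_min = min(t.energy_per_atom for t in same_functional)
    return candidate.energy_per_atom - e_min <= MAX_OFFSET_EV_PER_ATOM
```
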
mkhorton commented 3 years ago

In brief,

  1. This change came as a result of preferring static calculations with the new input set, i.e. not considering old relaxations. This was done out of an abundance of caution, since the new set has LASPH: True (as strongly recommended by VASP) while the historical calculations did not, and also because the relaxations themselves had some serious issues: some old relaxations were missing LSORBIT being set correctly, some had bad Pulay stresses or under-converged k-points, and some had U values permuted or reported as being permuted. Standardizing on a modern set of calculations from a consistent VASP version seemed to resolve a lot of these issues, and since we were re-running all materials regardless, standardizing on the new set of statics seemed safe. However, we were not aware of the custom calculations for some Co materials. (See the LASPH sketch after this list.)

  2. This will reject useful calculations, since information about magnetic orderings is contained in calculations > 50 meV/atom different in energy. As long as these high-energy calculations are not blessed, it shouldn't be an issue.

  3. This is currently happening.
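
For reference, a minimal sketch of how the LASPH setting can be applied via pymatgen's input sets; the structure file path is illustrative, and newer MP input sets include LASPH: True by default, so the explicit override is only needed for older sets:

```python
from pymatgen.core import Structure
from pymatgen.io.vasp.sets import MPStaticSet

# Illustrative structure file
structure = Structure.from_file("LiCoO2.cif")

# Aspherical gradient corrections, as strongly recommended by VASP
static_set = MPStaticSet(structure, user_incar_settings={"LASPH": True})
print(static_set.incar["LASPH"])  # True
```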

The re-build and release is planned, but I think it will take comparable time whether we patch in builder logic to 'downgrade' or wait for the revised calculations to complete and build those in with the current builder logic, so my preference is for the latter, since it simply requires including the new data.

There are also additional issues we're addressing with the new calculations (see the ICHARG issue in devops), so it's important that we get the release out with the new calculations for that reason too. There are more bad energies than just Co.

Beyond this immediate issue, the next step is to re-optimize a subset of these older calculations, since many have alarmingly high forces. We will not start on this until the present issue is fixed and deployed, but it will also be important to ensure the integrity of the database.
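
A minimal sketch of how such high-force calculations could be flagged, assuming a vasprun.xml on disk; the 0.05 eV/Å threshold is illustrative, not a project standard:

```python
import numpy as np
from pymatgen.io.vasp.outputs import Vasprun

FORCE_TOL = 0.05  # eV/Å; illustrative threshold

def needs_reoptimization(vasprun_path: str) -> bool:
    """Flag a calculation whose final residual forces are alarmingly high."""
    vr = Vasprun(vasprun_path, parse_dos=False, parse_eigen=False)
    # Forces from the last ionic step, shape (n_atoms, 3), in eV/Å
    forces = np.array(vr.ionic_steps[-1]["forces"])
    return float(np.linalg.norm(forces, axis=1).max()) > FORCE_TOL
```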

shyuep commented 3 years ago

Note that for 2, I mean that the blessed task cannot be more than 50 meV/atom above ANY other calculation unless explicitly allowed. Even if there are 10,000 magnetic orderings with energies above the ground state, that is not a problem. What is a problem is when one of those non-ground-state tasks becomes the blessed task.

The "downgrade" is independent of the new runs. The new runs ensure you have more accurate static calculations. But the downgrade, i.e., blessing the lowest energy task regardless of whether it is a "new" static, is something that needs to happen to prevent future occurrences of this problem. You need to push the new blessing code anyway and you might as well release the new db instead of waiting for the new runs to be ready. You can then rebless the collection after your new runs are done and release another DB version. The less time we have things like LCO being 200 meV/atom above hull on the website, the better. Unless you are saying that the runs will be done within a day or two, we should address the bug fix.

computron commented 3 years ago

I'm 100% with @shyuep on this one. The procedure we have tried to use historically, which requires a lot of close coordination with team members (easier with a smaller team), would look something like:

  1. Within 1 hour of noticing a problem - a banner is put up on the website, informing users that a problem with some of the recent data has been noticed and a fix is on the way.
  2. Within 3 days of noticing a problem - a revised "blessing" code has been applied to the old data and the old database is fixed for these obvious problems. The banner is removed.
  3. Hard to say when - a new database release with new data is pushed out (honestly, I don't think we were all that much faster than today in terms of releasing new databases; this has always been a problem).

MP usage is only increasing (just recently I reviewed another paper that uses MP as its sole data set and, of course, relied on linking to MP for the original data rather than snapshotting it for the paper).

I am going to set up a meeting to try to get at least (1) and (2) under control ...