Closed CompRhys closed 7 months ago
The original code won't work today due to
Instead, I will state an outline of the MPtrj parsing process:
1. Query all the existing mp_ids.
2. Query all the tasks from the mp_ids with `task_types = ['GGA Static', 'GGA Structure Optimization', 'GGA+U Static', 'GGA+U Structure Optimization']`.
3. For each task queried from `mpr.tasks.get_data_by_id`, check its calculation compatibility with the associated thermodoc entry queried from `mpr.get_entry_by_material_id`. This includes:
   - INCAR setting checks
4. For trajectory frames that passed step 3, use pymatgen `StructureMatcher` to ensure frame similarities are low.
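As a rough illustration of steps 3 and 4, here is a minimal sketch in plain Python. It is not the actual MPtrj code: the function names, the `INCAR_KEYS_TO_CHECK` tuple, and the choice of keys are all assumptions made for illustration, and the real rule set is not spelled out in this thread. The similarity test is passed in as a callable so that the same loop works with any comparison function.

```python
from typing import Any, Callable, List, Mapping, Sequence, TypeVar

Frame = TypeVar("Frame")

# Hypothetical subset of INCAR keys to compare (an assumption for illustration;
# the actual MPtrj checks are not listed in this thread).
INCAR_KEYS_TO_CHECK = ("ENCUT", "LDAU", "ISPIN")


def incar_compatible(
    task_incar: Mapping[str, Any],
    reference_incar: Mapping[str, Any],
    keys: Sequence[str] = INCAR_KEYS_TO_CHECK,
) -> bool:
    """Step 3 (sketch): a task passes if every checked key matches the reference."""
    return all(task_incar.get(key) == reference_incar.get(key) for key in keys)


def select_distinct_frames(
    frames: Sequence[Frame],
    is_similar: Callable[[Frame, Frame], bool],
) -> List[Frame]:
    """Step 4 (sketch): keep a frame only if it is not similar to any kept frame."""
    kept: List[Frame] = []
    for frame in frames:
        if not any(is_similar(frame, other) for other in kept):
            kept.append(frame)
    return kept
```

With pymatgen installed, `is_similar` could be the bound method `StructureMatcher().fit`, which compares two `Structure` objects and returns a boolean. Additional screening rules can be added as further predicates alongside `incar_compatible`.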
Understood re the API calls. I still believe it would be great to share a programmatic example capturing the screening process (steps 3/4), particularly so that people can extend it with additional rules or recreate something similar on databases like OQMD. Code is the fundamental way we ensure our work is reproducible.
This is further addressed here.
Problem
It would be great if we could have an example notebook showing the MPtrj query pattern and cleaning. It is worth noting that the MP query would need to be pinned to v2021.11.10 to arrive at the same dataset as the current MPtrj, but having the notebook would enable users to recreate similar datasets for newer releases like v2023.11.1, where a large number of materials have been both added and deprecated.
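One way such a notebook could guard against silently running on the wrong release is a small version check before any querying. The sketch below is an assumption about how that might look: the helper names are hypothetical, and the reported version string is assumed to be obtained from the API client and passed in as plain text.

```python
from typing import Tuple

# Release named above as the one needed to reproduce the current MPtrj.
PINNED_DB_VERSION = "2021.11.10"


def parse_db_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted release string such as 'v2023.11.1' into a comparable tuple."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def check_db_version(reported: str, pinned: str = PINNED_DB_VERSION) -> bool:
    """Return True when the reported release matches the pinned one exactly."""
    return parse_db_version(reported) == parse_db_version(pinned)
```

A notebook targeting a newer release would pass that release as `pinned` instead, so the same check documents which database version a regenerated dataset came from.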
Proposed Solution
Notebook should by default have a `smoke_test` version that would only perform the cleaning on a smaller query.
Alternatives
No response