materialsproject / emmet

Be a master builder of databases of material properties. Avoid the Kragle.
https://materialsproject.github.io/emmet/

[Idea]: User-friendly access to properties #840

Open davidwaroquiers opened 11 months ago

davidwaroquiers commented 11 months ago

Problem

It would be nice to be able to access some deep-buried properties in a more user-friendly way.

One example is the magnetic density which can be accessed in the output document as: doc["output"]["calcs_reversed"][0]["output"]["mag_density"]

It is a bit "involved" and not easy to get for the average user.
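
For illustration, the access path above could be wrapped in a small generic helper. This is only a minimal sketch and not part of emmet: `get_nested`, `MAG_DENSITY_PATH`, and the sample document (including its value) are hypothetical.

```python
from typing import Any, Sequence, Union

Key = Union[str, int]


def get_nested(doc: Any, path: Sequence[Key], default: Any = None) -> Any:
    """Walk dicts and lists along ``path``; return ``default`` if a step is missing."""
    current = doc
    for key in path:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return default
    return current


# Hypothetical document mimicking the layout quoted above (placeholder value).
doc = {"output": {"calcs_reversed": [{"output": {"mag_density": 0.05}}]}}

MAG_DENSITY_PATH = ("output", "calcs_reversed", 0, "output", "mag_density")
print(get_nested(doc, MAG_DENSITY_PATH))
```

Such a helper shortens the call site, but the user still has to know the path, which is the actual difficulty raised here.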

Many properties are already available at the first level, e.g. density, bandgap, ... Of course, you don't want to have gazillions of those, and should only keep the most important ones.

Would something like that be a desirable/acceptable addition to emmet?

This question came up while training and helping other users with atomate2 (and hence emmet documents): it is somewhat difficult to explain where a given piece of information is stored, or to offer an easy way to find it.

Proposed Solution

I would see two solutions:

Alternatives

The alternative is simply to keep things as they are.

davidwaroquiers commented 11 months ago

Just to complement, here is the diagram of the TaskDoc model. Found this erdantic package (https://github.com/drivendataorg/erdantic), which is quite nice.

[Image: taskdoc diagram]

munrojm commented 11 months ago

I think you are generally right about individual property methods with respect to how many would need to be added. However, if a property like magnetic_density is sufficiently important for people, I would argue a dedicated root-level attribute to get the data should probably be there. My guess is the list of those may not be super large, in which case your first proposal might be okay.

For your second idea, do you envision people using the helper method to effectively search the nested schema for mention of some property?

munrojm commented 11 months ago

Just to complement, here is the diagram of the TaskDoc model. Found this erdantic package (https://github.com/drivendataorg/erdantic), which is quite nice.

Thanks for posting this, I really like it.

davidwaroquiers commented 11 months ago

Just to complement, here is the diagram of the TaskDoc model. Found this erdantic package (https://github.com/drivendataorg/erdantic), which is quite nice.

Thanks for posting this, I really like it.

Yes, it even has tooltips showing the full docstring when you hover over a box (if the output is an SVG). It could definitely be used almost directly in the emmet documentation.

davidwaroquiers commented 11 months ago

However, if a property like magnetic_density is sufficiently important for people, I would argue a dedicated root-level attribute to get the data should probably be there. My guess is the list of those may not be super large, in which case your first proposal might be okay.

Right, I guess some additional dedicated root-level attributes would be OK. Another consideration is querying: the second idea (one Python property or method per material property, or one method covering several properties) cannot be used to query the database, since the computed value is obviously not stored in it.

For your second idea, do you envision people using the helper method to effectively search the nested schema for mention of some property?

The idea would be to implement a "shortcut" for some properties. A (stupid) example for magnetic_density would be:

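class TaskDoc(StructureMetadata, extra="allow"):
    ...
    def get_property(self, prop_name):
        """Return a deeply nested property by its short name."""
        if prop_name == "magnetic_density":
            # calcs_reversed is a root-level field; index 0 is the most recent calculation
            return self.calcs_reversed[0].output.mag_density
        raise KeyError(f"No shortcut defined for {prop_name!r}")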

Again, this does not allow querying on that property (e.g. give me all the materials for which the magnetic density is larger than X), but it at least makes the value easier to get than spelling out the full "path" in the nested schema. It still leaves open the question of which property shortcuts to implement.
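
To make the shortcut idea a bit more concrete, here is a minimal sketch of a table-driven variant in which adding a shortcut means adding one dictionary entry. Everything in it is an assumption for illustration: `PROPERTY_PATHS`, `resolve`, and the dataclass stand-ins are hypothetical and are not the emmet models, and the path assumes `calcs_reversed` is reachable from the root of the document (adjust the string otherwise).

```python
from dataclasses import dataclass, field
from functools import reduce
from typing import Any, List

# Hypothetical shortcut table: property name -> dotted path inside the document.
PROPERTY_PATHS = {
    "magnetic_density": "calcs_reversed.0.output.mag_density",
}


def resolve(obj: Any, dotted_path: str) -> Any:
    """Follow a dotted path, treating numeric parts as list indices."""
    def step(current: Any, part: str) -> Any:
        if part.isdigit():
            return current[int(part)]
        return getattr(current, part)

    return reduce(step, dotted_path.split("."), obj)


# Minimal stand-ins for the real models, only to make the sketch runnable.
@dataclass
class CalcOutput:
    mag_density: float


@dataclass
class Calculation:
    output: CalcOutput


@dataclass
class TaskDocSketch:
    calcs_reversed: List[Calculation] = field(default_factory=list)

    def get_property(self, prop_name: str) -> Any:
        try:
            path = PROPERTY_PATHS[prop_name]
        except KeyError:
            raise KeyError(f"No shortcut defined for {prop_name!r}") from None
        return resolve(self, path)


# Placeholder value, not real data.
task = TaskDocSketch(calcs_reversed=[Calculation(CalcOutput(mag_density=0.05))])
print(task.get_property("magnetic_density"))
```

Keeping the paths as plain dotted strings might also help with the querying concern, since the same string could in principle be reused as a field name in a MongoDB-style query, but that would need to be checked against how the documents are actually stored.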

davidwaroquiers commented 5 months ago

Just to complement, here is the diagram of the TaskDoc model. Found this erdantic package (https://github.com/drivendataorg/erdantic), which is quite nice.

Thanks for posting this, I really like it.

I should also mention that this can be integrated directly into the API documentation with Sphinx (I have seen that MkDocs is used for emmet-core).

I thought I'd mention it because I think it would be very beneficial to have these entity relationship diagrams in the documentation. You can find an example in our newly released atomate2-turbomole add-on, e.g. here: https://matgenix.github.io/atomate2-turbomole/api/atomate2.turbomole.schemas.task.html#atomate2.turbomole.schemas.task.TaskDocument, where a dropdown shows the diagram: [image]

The image in this case (it can be zoomed, of course): [image]

This is done using autodoc_pydantic_model_erdantic_figure = True in conf.py.
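
As a rough sketch, the relevant part of a Sphinx conf.py could look like the fragment below. Only the `autodoc_pydantic_model_erdantic_figure` line comes from this thread. The extension list and the collapsed option are assumptions about a typical autodoc_pydantic setup, and the diagrams additionally require erdantic and Graphviz to be installed.

```python
# conf.py (fragment); the surrounding Sphinx configuration is assumed to exist.
extensions = [
    "sphinx.ext.autodoc",
    "sphinxcontrib.autodoc_pydantic",
]

# Render an erdantic entity relationship diagram for each documented pydantic model.
autodoc_pydantic_model_erdantic_figure = True

# Assumed option: keep the diagram folded behind a dropdown until it is expanded.
autodoc_pydantic_model_erdantic_figure_collapsed = True
```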

munrojm commented 5 months ago

Thanks for this, I will try to add it this week.