materialsproject / api

New API client for the Materials Project
https://materialsproject.github.io/api/

MPRester summary.search arguments chemsys and exclude_elements not working together #766

Open matthewcarbone opened 1 year ago

matthewcarbone commented 1 year ago

It appears that combining the chemsys argument with the exclude_elements argument in a single query does not work as intended. For example,

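from mp_api.client import MPRester

# MPRester() with no arguments expects an API key to be configured already.
with MPRester() as mpr:
    # Wildcard chemsys: Ti, O, and any one other element.
    docs = mpr.summary.search(
        chemsys=["Ti-O-*"],
        fields=["material_id", "structure"]
    )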

leads to 1111 structures, whereas

from mp_api.client import MPRester

with MPRester() as mpr:
    # Same wildcard chemsys, now also asking to exclude nitrogen.
    docs = mpr.summary.search(
        chemsys=["Ti-O-*"],
        exclude_elements=["N"],
        fields=["material_id", "structure"]
    )

returns 61803 structures, when the result should be a subset of the first query (at most 1111).

This is a minor thing, since one can probably just screen out structures containing nitrogen manually after the fact, but I do think there should at least be a mechanism for preventing users from using these two arguments together (or it's a bug, or I've missed something). Thanks!
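The manual screening mentioned above could look roughly like the sketch below. It is only an illustration: `drop_docs_containing`, `get_symbols`, and the toy documents are hypothetical names made up for this example and are not part of the mp-api client. For real results one would query with `chemsys` alone and pass a `get_symbols` callable that reads the element symbols from each returned document.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def drop_docs_containing(
    docs: Iterable[T],
    excluded: Iterable[str],
    get_symbols: Callable[[T], Iterable[str]],
) -> List[T]:
    """Keep only documents whose element symbols avoid every excluded symbol.

    `get_symbols` is a caller-supplied (hypothetical) accessor that returns
    the element symbols of one document.
    """
    banned = set(excluded)
    return [doc for doc in docs if banned.isdisjoint(get_symbols(doc))]


# Toy stand-ins for search results; real documents come from the API.
toy_docs = [
    {"material_id": "toy-1", "symbols": {"Ti", "O", "Ba"}},
    {"material_id": "toy-2", "symbols": {"Ti", "O", "N"}},
    {"material_id": "toy-3", "symbols": {"Ti", "O", "Sr"}},
]

kept = drop_docs_containing(toy_docs, ["N"], lambda doc: doc["symbols"])
kept_ids = [doc["material_id"] for doc in kept]
```

Because the filter runs on the client, it does not depend on how the server combines the two arguments, at the cost of downloading the nitrogen-containing entries first.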

munrojm commented 1 year ago

@matthewcarbone, thank you very much for bringing this to our attention. It should definitely behave as you describe. I will take a closer look.

matthewcarbone commented 1 year ago

@munrojm no problem! It's strange; I think the query is constructed properly. Might it be a problem with the query engine itself? Let me know if there's anything I can do!

munrojm commented 1 year ago

Yeah, I suspect this is a server-side issue. This function is what translates the chemsys search request to a MongoDB query. For wildcard searches it uses the elements field, just as the query for a list of elements to exclude does. It looks like these two query operators will have to be merged with some added logic to fix the issue. I'll take a look at fixing this in the next few days.
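To illustrate why two conditions on the same field may need merging, here is a minimal sketch. It is not the actual server code: the fragment shapes, the `nelements` field, and the helpers `naive_combine` and `merged_combine` are assumptions made up for this example. It only shows how two dict-based MongoDB-style fragments keyed on `elements` can overwrite one another when combined naively, and one way to merge them instead.

```python
def naive_combine(*fragments):
    """Combine query fragments with dict.update; later keys overwrite earlier ones."""
    query = {}
    for fragment in fragments:
        query.update(fragment)
    return query


def merged_combine(*fragments):
    """Combine query fragments, merging operator dicts that target the same field."""
    query = {}
    for fragment in fragments:
        for field, condition in fragment.items():
            existing = query.get(field)
            if isinstance(existing, dict) and isinstance(condition, dict):
                # Both sides are operator dicts: keep the operators from each.
                query[field] = {**existing, **condition}
            else:
                query[field] = condition
    return query


# Hypothetical fragment for a "Ti-O-*" wildcard chemsys search.
chemsys_fragment = {"elements": {"$all": ["Ti", "O"]}, "nelements": 3}
# Hypothetical fragment for exclude_elements=["N"].
exclude_fragment = {"elements": {"$nin": ["N"]}}

naive = naive_combine(chemsys_fragment, exclude_fragment)
merged = merged_combine(chemsys_fragment, exclude_fragment)
```

In this sketch the naive result keeps only the `$nin` condition on `elements`, so the Ti and O requirement is lost, while the merged result carries both `$all` and `$nin`. Whether the real server combines its fragments this way is not confirmed in this thread.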