Open oslopanda opened 1 year ago
Hey,
Sorry for the late reply.
Indeed, we do query by each combination of elements instead of querying the whole list of elements at once. I do not remember exactly why I did it this way (the queriers themselves were one of the first things I coded, and it's actually due to a refactor), but I believe it was for the following reasons:
A few of the providers did not support querying the elements inclusively for a given maximum number of species. I.e., if you asked for elements A, B, C, D with up to 4 elements, meaning you want A, AB, ABC, etc., one of the providers (or a couple, not sure) would, for A and B with a maximum of 3 elements for example, give back AB plus something else, such as E or F (sorry if it's confusing). For example: if you gave [Sr, Ca, Ba, O] with the maximum set to 4, some providers would, for Sr and Ca, return all ternary compounds that have Sr and Ca plus something else, and likewise for all the other possible combinations. The way to circumvent this was to build up precisely all the possible element combinations and query each chemical space separately.
I wanted to keep track of the element combinations so that, when querying overlapping spaces, already downloaded spaces could be skipped. Thinking back, this is quite pointless, since this can also be checked after querying.
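To make the two points above concrete, here is a minimal sketch of the idea, not the actual Xerus code. The function names (`element_combinations`, `space_key`, `pending_spaces`) and the `already_downloaded` set are hypothetical and only for illustration:

```python
from itertools import combinations


def element_combinations(elements, max_size=None):
    """Return every non-empty combination of `elements` up to `max_size` elements."""
    elements = sorted(set(elements))
    if max_size is None:
        max_size = len(elements)
    combos = []
    for size in range(1, max_size + 1):
        combos.extend(combinations(elements, size))
    return combos


def space_key(combo):
    """Order-independent label for a chemical space, e.g. ('Sr', 'Ca') -> 'Ca-Sr'."""
    return "-".join(sorted(combo))


def pending_spaces(elements, already_downloaded, max_size=None):
    """Combinations whose chemical space is not in `already_downloaded` (a set of keys)."""
    return [
        combo
        for combo in element_combinations(elements, max_size)
        if space_key(combo) not in already_downloaded
    ]


if __name__ == "__main__":
    for combo in pending_spaces(["Sr", "Ca", "Ba", "O"], already_downloaded={"Ca-Sr"}):
        print(space_key(combo))
```

For four elements this gives 2^4 - 1 = 15 chemical spaces, so 15 separate queries per provider if nothing has been downloaded before.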
I believe the queriers can be revisited and rewritten to be more efficient. In particular, the COD querying should be moved entirely to the OPTIMADE interface, which I believe would give more flexibility. This can be reinvestigated; right now I don't have much free time to work on Xerus (sorry!), but I would be glad to review and check any PR. Any other improvements are also welcome!
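As a rough illustration of what an exact chemical-space query could look like through OPTIMADE, the sketch below only builds the filter string. The `elements HAS ALL ...` and `nelements=...` constructs come from the OPTIMADE filter grammar; the helper name `optimade_filter` is hypothetical, and whether a given provider (COD included) handles this filter as expected is an assumption that would need to be checked:

```python
def optimade_filter(combo):
    """Build an OPTIMADE filter matching structures with exactly the elements in `combo`.

    `elements HAS ALL` requires every listed element to be present, and
    `nelements` pins the total count, so no extra element can appear.
    """
    quoted = ", ".join(f'"{el}"' for el in sorted(combo))
    return f"elements HAS ALL {quoted} AND nelements={len(combo)}"


if __name__ == "__main__":
    print(optimade_filter(("Sr", "Ca")))
```

The resulting string would be passed as the `filter` query parameter of a provider's `/structures` endpoint.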
Hello,
Thank you for the nice work! While using Xerus, I feel there might be a lot of repeated querying of the database. For instance, for a system with elements A, B, C, and D, the program tries to query the combinations A, B, C, D, (AB), (AC), (AD), (BC), (CD)... However, if you just query for (ABCD), wouldn't you get all the combinations from the database? Or am I totally wrong?