libAtoms / abcd


Interfaces different databases (future) #27

Open fekad opened 5 years ago

fekad commented 5 years ago

Caching data from different databases. Optimade could be an ideal interface for different databases.

New subcommand for abcd: abcd cache [optimade_url] -q [query] -p [extra_properties]
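As a rough illustration of the proposed command line, here is a minimal argparse sketch. It is not the actual abcd CLI code: the function name `build_parser` and the long option names `--query` and `--properties` are assumptions made for this example; only `cache`, `optimade_url`, `-q` and `-p` come from the proposal above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser for the proposed `abcd cache` subcommand."""
    parser = argparse.ArgumentParser(prog="abcd")
    subparsers = parser.add_subparsers(dest="command", required=True)

    cache = subparsers.add_parser(
        "cache", help="cache data from a remote OPTIMADE endpoint"
    )
    cache.add_argument("optimade_url", help="base URL of the remote database")
    # -q may be repeated; each occurrence adds one query expression.
    cache.add_argument("-q", "--query", action="append", default=[],
                       help="query used to restrict what is cached")
    # -p takes one or more extra property names to download.
    cache.add_argument("-p", "--properties", nargs="+", default=[],
                       help="extra properties to cache")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

With this sketch, `abcd cache https://example.org/optimade -q "nelements=2" -p energy forces` would yield one query string and two property names.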

Other features:

fekad commented 4 years ago

There are multiple ways to create an interface for optimade. @gabor1, which of the following best represents the desired functionality:

gabor1 commented 4 years ago

I think this caching functionality needs more thought. For example, one idea is that it should happen in the background without the user being aware of the caching, so there would be no separate subcommand, but the full query would be available (because the data is cached). But clearly that is a lot of work. So in the meantime, I suggest that we do the first thing, which is that the user can log in to an optimade address and use restricted queries. I like your idea of the download/upload extension, but it's unclear to me whether this should be part of the download or the upload!

fekad commented 4 years ago

Caching in the background is not necessarily more work than the other solutions, but we would potentially have to transfer a huge amount of data continuously (e.g. almost the whole Nomad database if you want a histogram of the positions). Even smart caching (downloading a property only if it is not already available locally) won't help much.
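To make "smart caching" concrete, here is a minimal sketch of the idea: only the properties missing from the local cache are requested from the remote side. The names `missing_properties`, `smart_fetch` and the `fetch_remote` callable are hypothetical and not part of abcd.

```python
def missing_properties(requested, cached):
    """Return the requested property names that are not cached yet, in order."""
    cached_names = set(cached)
    return [name for name in requested if name not in cached_names]


def smart_fetch(requested, local_cache, fetch_remote):
    """Fill local_cache with whatever is missing, then return the requested values.

    fetch_remote is a hypothetical callable that takes a list of property
    names and returns a dict mapping each name to its downloaded data.
    """
    to_fetch = missing_properties(requested, local_cache)
    if to_fetch:
        local_cache.update(fetch_remote(to_fetch))
    return {name: local_cache[name] for name in requested}
```

This avoids repeated downloads of the same property, but as noted above, the first request for something like a histogram over positions would still pull that property for every matching entry.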

fekad commented 4 years ago

With the first option, you can have only limited query options (only those supported by optimade), and the summary will only be able to show how much data is available for a specific property, but no histograms.
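A summary of this restricted kind could look roughly like the sketch below, which counts how many entries carry each property without looking at the values themselves. It assumes entries are available as plain dicts; `property_counts` is a hypothetical helper, not an abcd function.

```python
from collections import Counter


def property_counts(records):
    """Count, per property name, how many records have a non-missing value."""
    counts = Counter()
    for record in records:
        counts.update(key for key, value in record.items() if value is not None)
    return dict(counts)
```

A histogram, by contrast, needs the actual values of the property for every entry, which is exactly the data transfer this option avoids.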

gabor1 commented 4 years ago

I guess that is where the 'cache' command would come in useful, because the user would explicitly direct the download of given properties (and often filtered by -q queries, so not the entire database). This suggests that we MUST have the limited functionality interface directly, otherwise how else would I discover what Nomad has, and what I might want to cache?
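For the direct, limited interface, a request to an OPTIMADE server is essentially a URL with `filter`, `response_fields` and `page_limit` query parameters on the `structures` endpoint. The sketch below only builds such a URL; the helper name `optimade_structures_url` is hypothetical, and it assumes `base_url` is the unversioned base URL of the server so that `/v1/structures` can be appended.

```python
from urllib.parse import urlencode


def optimade_structures_url(base_url, filter_expr, response_fields=(), page_limit=None):
    """Build a query URL for the OPTIMADE structures endpoint.

    Assumes base_url is the unversioned server base URL (an assumption of
    this sketch); the filter string is passed through unchanged.
    """
    params = {"filter": filter_expr}
    if response_fields:
        # Restrict the response to the listed fields (comma-separated).
        params["response_fields"] = ",".join(response_fields)
    if page_limit is not None:
        params["page_limit"] = page_limit
    return f"{base_url.rstrip('/')}/v1/structures?{urlencode(params)}"
```

A -q query would map onto `filter_expr` and the -p properties onto `response_fields`, which keeps the download limited to what the user explicitly asked for.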

fekad commented 4 years ago

How should we handle the units? More generally, how should we keep track of any changes after "caching"? Should the cached data be read-only?

gabor1 commented 4 years ago

Well, this opens up the whole issue of “data mapping”, which needs config files if we want to do it properly. So in keeping with the simplicity of abcd, I suggest we do nothing to the data right now. As for writing, we could have a property like “cache_dirty”, which means the entry has been edited, and the cache subcommand could be used to refresh it. I have a nasty feeling this needs a bit more planning...

-- Gábor

Gábor Csányi Professor of Molecular Modelling Engineering Laboratory Pembroke College University of Cambridge
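The “cache_dirty” suggestion could be sketched roughly as follows: a local edit marks the entry as dirty, and a refresh replaces the local copy with the remote data and clears the flag. Only the property name `cache_dirty` comes from the discussion; the class `CachedEntry`, its methods and the `fetch_remote` callable are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CachedEntry:
    """Hypothetical cached record with a dirty flag for local edits."""
    properties: dict = field(default_factory=dict)
    cache_dirty: bool = False

    def set_property(self, name, value):
        # Any local edit means the entry no longer mirrors the remote source.
        self.properties[name] = value
        self.cache_dirty = True

    def refresh(self, fetch_remote):
        # fetch_remote is an assumed callable returning the remote properties.
        # Local edits are discarded and the flag is cleared.
        self.properties = dict(fetch_remote())
        self.cache_dirty = False
```

Whether a refresh should silently discard local edits, as it does here, is one of the open planning questions.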
