aiidateam / aiida-pseudo

MIT License

Basis set management? #45

Open zooks97 opened 3 years ago

zooks97 commented 3 years ago

As I'm starting to play with local-basis DFT codes (e.g. OpenMX, Siesta, ORCA, Gaussian, etc.), it's become clear that along with pseudopotentials, one has to manage basis sets in a very similar way.

The framework for this would be basically identical to what we do here in aiida-pseudo, and as such, do you think it would be better to extend aiida-pseudo to also support managing basis sets or rather make a parallel aiida-basis plugin explicitly for that purpose?

I need to think it over a bit more, but there could be significant shared code between the efforts, and it may be easier for both efforts to benefit from a shared foundation.

bosonie commented 3 years ago

As a first comment, I would say that the task is not as easy as it seems. Optimal basis sets most likely depend on the chemical environment. SIESTA has almost 20 years of development behind it, and nobody has even dared to create a database of basis sets. I guess that with the advent of high-throughput we can definitely gather some systematic knowledge, but so far I see it as much more useful to have a basis optimizer that runs on the system of interest before doing the calculation. In any case, I would start this massive project separately from aiida-pseudo.

zooks97 commented 3 years ago

I'd be interested to understand the problem better; maybe we can get in touch and discuss sometime.

Like you mention, I think that the most important application would be high-throughput, but I think it could also make using basis sets from, for example, the BSE easier and more provenance-friendly.

As I've thought about it a bit more, I think I agree that starting a separate plugin, while maybe taking some cues from aiida-pseudo, would be a better path forward.

bosonie commented 3 years ago

Yes sure, we can have a chat in a week or two. Let me know!

sphuber commented 3 years ago

You could definitely start by developing this in aiida-basis, even depending directly on aiida-pseudo to reuse bits without copying. Then, when things have settled and seem to work well, and there is still a lot of overlap, we can merge it.

dev-zero commented 3 years ago

@zooks97 maybe also take a look at my aiida-gaussian-datatypes plugin?

zooks97 commented 3 years ago

@dev-zero Thanks for mentioning it! I saw it around the time I created this issue, but I'll give it another look. Maybe it could be possible to do something similar for numeric orbitals?

dev-zero commented 3 years ago

@zooks97 sure. From my point of view there are the following points when designing a data type plugin for basis sets:

addman2 commented 2 years ago

Dear all,

I am also interested in this topic. Is there anything new since the last comment was made?

dev-zero commented 2 years ago

@addman2 what exactly are you interested in? Which types of basis sets?

addman2 commented 2 years ago

Dear @dev-zero,

You can find the details in this mailing list thread:

https://groups.google.com/g/aiidausers/c/kdoLb-NO4LI

I will summarize. I am writing an AiiDA package for our QMC code. We mostly use PPs from these two databases:

https://pseudopotentiallibrary.org/
http://burkatzki.com/pseudos/index.2.html

I was thinking of adding them as installable "families" inside the aiida-pseudo package. Similarly, it could retrieve the recommended basis for the PP. I am mainly interested in GTO bases, but I don't want to be restricted to them. I was looking at your aiida-gaussian-datatypes package and it has 80% of the functionality I was looking for. I really like the way the Basis and Pseudo data types were made.

The main thing that is missing is an automatic fetcher from the internet. I can contribute to this.

azadoks commented 2 years ago

I've been working on this sporadically here. I have (mostly) working support for OpenMX PAO bases which are managed as loose files just as done by aiida-pseudo.

For GTO bases, I was working to integrate with the Basis Set Exchange Python module. I generally only have experience with plane-wave codes and with OpenMX, so I don't know the best way to handle, e.g., GTO bases in AiiDA (i.e. as files, as done here, or as an AiiDA data type that contains the relevant data plus some code for writing that data in different formats).

I'd really appreciate any feedback, maybe over in the aiida-basis repository, @dev-zero and @addman2.

p.s. as you mentioned recommended PPs corresponding to basis sets, this is another open question of mine and why I made this issue here in aiida-pseudo first. OpenMX provides basis-pseudo pairs, and it would make sense to me to provide both with the same AiiDA plugin (although there is a many-to-one correspondence between bases and pseudopotentials respectively).

addman2 commented 2 years ago

Hi azadoks,

Sorry for my late response, I've been busy lately. I started playing with aiida-gaussian-datatypes in order to find out whether it is possible to use it for the GTO basis and ECP I'm using. I started with PPs. It turned out that the PPs from the original library were not compatible with mine, so I created an abstract class Pseudopotential(Data) and two derived classes: the original one and one that fits my format. I think this worked out well; you can check it here.
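
The class layout described above could look roughly like the following minimal sketch. It is not the code from the linked branch: the class names `CoefficientFirstPseudo` and `ExponentFirstPseudo`, the `terms` payload, and both text formats are hypothetical, and plain Python stands in for the AiiDA `Data` base class so the example runs without an AiiDA profile.

```python
from abc import ABC, abstractmethod


class Pseudopotential(ABC):
    """Hypothetical stand-in for the abstract Pseudopotential(Data) class.

    In a real plugin this would derive from an AiiDA data type; here it is a
    plain abstract base class so the sketch has no external dependencies.
    """

    def __init__(self, element, terms):
        self.element = element
        # terms: list of (coefficient, exponent) pairs, placeholder values only
        self.terms = list(terms)

    @abstractmethod
    def to_text(self):
        """Render the stored terms in a code-specific text format."""


class CoefficientFirstPseudo(Pseudopotential):
    """Hypothetical format: one 'coefficient exponent' pair per line."""

    def to_text(self):
        return "\n".join(f"{c} {e}" for c, e in self.terms)


class ExponentFirstPseudo(Pseudopotential):
    """Hypothetical format: a count line, then 'exponent coefficient' lines."""

    def to_text(self):
        lines = [str(len(self.terms))]
        lines += [f"{e} {c}" for c, e in self.terms]
        return "\n".join(lines)


terms = [(4.0, 1.5), (-2.0, 0.5)]  # made-up numbers, not a real ECP
first = CoefficientFirstPseudo("C", terms)
second = ExponentFirstPseudo("C", terms)
```

Both derived classes share the same stored data and differ only in how they write it out, which is the point of putting the abstract class in between.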

The next step I would like to take is the creation of a localized basis format. I was looking at your BasisData format in aiida-basis. One thing that concerns me is that BasisData is a Singlefile type. I plan to make adjustments to the basis set before I use it, so I believe a Dict (or Data) type would be more suitable.

dev-zero commented 2 years ago

@azadoks one of the things on my todo list for the aiida-gaussian-datatypes is also the import from the Basis Set Exchange. At the moment I would most likely implement it as a workflow (given an identifier the workflow fetches it and gets you a basis set object), for the sake of provenance.

@addman2 I think I could pull your changes directly into the plugin: CP2K can also support other types of pseudos (ECP), hence adding the type to the main plugin is definitely something we can and want to do. With regard to the basis sets: this is one of the reasons I decided to store the basis sets in the aiida-gaussian-datatypes plugin as a nested dict (in the database as JSON).
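
To illustrate why a nested dict makes it easy to adjust a basis set before use, here is a minimal sketch. The key names (`element`, `name`, `shells`, `l`, `exponents`, `coefficients`) and the helper `drop_shells_above` are hypothetical and are not the actual aiida-gaussian-datatypes schema; the numbers are placeholders, not a real basis set.

```python
import copy
import json

# Hypothetical nested-dict layout for a Gaussian-type basis set.
basis = {
    "element": "H",
    "name": "example-basis",
    "shells": [
        {"l": 0, "exponents": [3.0, 0.5], "coefficients": [0.2, 0.8]},
        {"l": 1, "exponents": [1.0], "coefficients": [1.0]},
    ],
}


def drop_shells_above(basis, l_max):
    """Return a modified copy keeping only shells with angular momentum <= l_max."""
    trimmed = copy.deepcopy(basis)  # leave the original dict untouched
    trimmed["shells"] = [s for s in trimmed["shells"] if s["l"] <= l_max]
    return trimmed


s_only = drop_shells_above(basis, 0)

# A nested dict of plain numbers, strings, and lists survives a JSON round trip.
serialized = json.dumps(s_only, sort_keys=True)
restored = json.loads(serialized)
```

With a single-file representation, the same adjustment would require parsing and rewriting the file in its code-specific format.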

sphuber commented 2 years ago

> given an identifier the workflow fetches it and gets you a basis set object

Does "fetching" here mean obtaining it from a URL? Because in that case, it might suffice to simply store the source in the Data node attributes. That is what it is designed for. It would be a bit overkill to go through a workflow.

dev-zero commented 2 years ago

@sphuber it may also consist of converting to the storage format
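
The two steps discussed above, converting to a storage format and recording where the data came from, can be combined in one place. The following is a minimal sketch in plain Python, not AiiDA code: the function `convert_with_source`, the `source` keys, the line-based input format, and the `example.org` URL are all hypothetical.

```python
import hashlib


def convert_with_source(raw_text, uri):
    """Parse 'exponent coefficient' lines and attach the origin of the text.

    The input format and the keys of the returned dict are illustrative only.
    """
    exponents, coefficients = [], []
    for line in raw_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        exponent, coefficient = (float(x) for x in line.split())
        exponents.append(exponent)
        coefficients.append(coefficient)
    return {
        "exponents": exponents,
        "coefficients": coefficients,
        "source": {
            "uri": uri,
            # checksum of the raw text, so the stored data can be traced back
            "sha256": hashlib.sha256(raw_text.encode("utf-8")).hexdigest(),
        },
    }


raw = "# placeholder data\n3.0 0.2\n0.5 0.8\n"
record = convert_with_source(raw, "https://example.org/basis/H.txt")
```

Whether this conversion is run inside a workflow or as an ordinary import step is exactly the open question in the thread; the sketch only shows that the source can travel with the converted data either way.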