Open shyamd opened 5 years ago
This by design - putting the settings in atomate leaves any changes to the default settings in atomate's control, not pymatgen's control.
e.g., if a user updates pymatgen (e.g., to get some new structure analysis feature, but which comes bundled with a change in pymatgen default settings) but leaves their atomate version alone, should their workflows change? It seems to me a user should need to explicitly update atomate in order to see new default settings.
I think that sounds reasonable if we kept up in updating atomate. The issue is that we don't update the workflows that often since a lot of them are written by postdocs who come in for a short period of time, write a workflow, and then leave. We don't want to end up with a situation where theres tons of stale technical debt simply because defaults in atomate lag behind pymatgen which a lot more people use even for running VASP.
Seconding that this is causing confusion for new atomate users. Ideally there should only be one source of truth. It also seems like an easy way for a bug to creep in if the pymatgen input sets are updated to fix an issue but atomate then overrides a particular setting.
For example, it's not clear to me why atomate has changed the default reciprocal density for MPNonSCFSet in NonSCFFW from 100 to 1000, if the value in pymatgen is too low, why not change it there?
If you guys are able to:
then I'm happy to move forward with putting everything in pymatgen.
Note that our DOS is already pretty bad as per many discussions. I'm more concerned about atomate still being too low than I am about pymatgen being closer to the a good value.
I would say that where atomate overlaps pymatgen, e.g., specifying parameters for MP-style calculations, these should be in pymatgen and atomate should avoid duplication. For other cases, there is no absolute truth, since it all depends what user requirements. I am fine if atomate has a different workflow called CapAmsRecipeForUltraAccurateDOS that is very different from MP defaults. But users who just want to perform a single calculation for comparison with MP should be able to do so without having to resort to atomate workflows.
I agree with @shyuep. Further, I think that atomate calculations should be entirely driven by PMG input sets (or any object that implements the API). To me, it seems rare that a user would actually want to pass different reciprocal densities to something like WriteVaspNSCFFromPrev
and be in a position to change that in a reasonable way. Instead, they probably want to set one and forget it (maybe it's the same as the MP set default, maybe not). Any customization outside of a suite of PMG input sets (such as the MP suite) should be achieved by subclassing PMG input sets and telling the Firetasks/Fireworks to use those.
We have been using completely different input sets more tuned for metals and we basically had to rewrite several Firetasks, e.g. WriteVaspStaticFromPrev
that assume MPStaticSet and propagate those changes up the FT/FW/WF structure.
@bocklund I see your point about being able to use input sets and having your own custom set. Basically if you want to use a non-MP input set it shouldn't be that hard to change it. That being said, i don't agree that people should have to make a new input set to make a change. The input sets should determine the defaults and the user should be able to then make other changes on top of that.
@shyuep Yes, it should also be easy to start from the MPRelaxSet, make a minor modification and give that to atomate without creating an entirely new class. In my experience, it's not easy to make modifications to the PMG input sets without creating a new class, but that's probably more a PMG issue rather than atomate.
As an aside, the current way custom VASP input sets are handled seems dangerous also, e.g. if you have a FW(struct, custom_vasp_input_set=None)
and then you supply your custom input set, it'll override whatever struct
you're also supplying to that FW without raising any error if the structures don't match? On the input sets themselves, I also realized recently there's no way to delete an INCAR argument, e.g. if you want to remove the default ENCUT, without creating a new input set (indeed, there is an input set that has explicitly added this option as kwarg as a workaround). We definitely need to improve the system for modifying the input sets that make up an atomate workflow, without using hacks.
Broadly speaking, I agree with all above + of course we are (collectively) MP, so it doesn't make sense that we've ended up in this situation where vanilla MPSet calculations have different defaults between atomate and pymatgen. I'm reminded of Conway's Law here :)
Ok I think this is a good discussion but I am not sure we are going to solve this problem over this thread, especially given that there are multiple concerns in the air now.
How about I suggest the following:
I will just have one final comment - I am confused what @bocklund means by saying that it is not easy to make a change to PMG input sets. The user_incar_settings, user_kpoints_settings, etc. all exist for that purpose.
@mkhorton The reason why it is dangerous is because the input set is supplied as an object, not a class/type or interpretable string. That is why the input set overrides structure in the atomate workflows. As for deleting parameters, we can easily add a feature that None => unsetting a parameter. E.g., a user_incar_setting = {"ENCUT": None}
would unset that particular setting for an input set. As far as I know, this would be a minor change in the DictSet and should affect no other feature since None is a special keyword that is not used in any of the settings.
I will of course not be able to attend the meeting. Deputy Dictator @mkhorton can speak on pymatgen's behalf.
I think we can continue the discussion here until the meeting actually happens
@shyamd @mkhorton what is your solution?
I have addressed some of these challenges in dfttk where we have rewritten parts of atomate that don't play well with custom input sets, such as the wrong structure problem for instances of InputSets that @mkhorton pointed out (though that "solution" is more of a hack).
IMO, a core part of this issue comes from how the WriteVasp*
tasks are implemented and I think changing those addresses most of it and the rest is just propagating the changes up the chain. If we're considering rewriting those, I'd also ask that https://github.com/hackingmaterials/atomate/issues/238 be considered if everyone at the in person meeting decides it's in scope for fixing this issue.
I'm happy to participate on any calls and offer my perspective. In my ideal world, this can reduce the amount of reimplemented code in dfttk, but I understand the difference in priorities.
@shyuep TLDR: user_XYZ_settings works, but it's a pain if you want to make the same modifications every time.
You can imagine that ultimately a user will develop custom settings are not one-offs that they want to apply to every run of atomate (e.g. I always want ENCUT to be 400 eV). Then you either have to apply user_XYZ_settings all over the place or automate it in some way. Automating it, you end up either
To give a real example of how this could be dangerous - the init() of the DictSet might sort and reduce the structure. If the structure is updated from the time an input set is created, there is no quick and easy way to make sure these sortings / reductions are properly updated without re-initializing the input set again. I can't just take the input set as given by the user and use it
So it sounds like some of the issues may be when the InputSet is being constructed. Currently, it happens twice in a lot of Fireworks, once on the construction of the Firework, and once on running the Firetask. Deciding when that should happen should make this a bit clearer. This could also depend on whats given resulting in a different firetask.
@bocklund This is a fundamental philosophical difference. The raison d'etre for having an input set is to standardize. By definition, that means modifications should be rare and exceptional. So I don't see a problem with a user writing a simple:
def get_custom_input_set(structure, *args, **kwargs):
return MPRelaxSet(structure, , *args, user_incar_settings={"ENCUT": 400}, **kwargs)
To claim this is as complex as subclassing DictSet is inaccurate.
As for sorting/reducing structures, the input set is by definition a manipulation of the structure to generate a set of inputs. Most of the time, the structure processing is done in accordance to some predefined rules, e.g., conventional/primitive standardization. By default, the only thing that DictSet does is to sort the structure, which is what you'd want in most cases to avoid having atoms of different species spread all around the input, with POSCAR species like Fe P O Fe O
.
The None to unset incar setting has been implemented. https://github.com/materialsproject/pymatgen/commit/111f44ffc450e2db8c30a923dfb664c309c76194
I have continued to give some thought on what it means (to me) to standardize. Keep in mind I am coming at this from outside the core MP group -- I have finite computer time, different projects with different accuracy needs, and I need to automate no more than a few hundred to a few thousand calculations for each. I see atomate as a way to communicate with the community and develop consensus on what the standard is for a type of calculation (e.g. a phonon calculation) in a code-agnostic fashion. To this extent, we should focus on implementing methods and let users determine the accuracy.
We can develop a guiding principle for calculation Fireworks that can be used to check a PR or proposed change against this principle. I propose the following: "Calculation Fireworks should only enforce minimally required calculation settings for the calculation type".
In more words: Each Firework has some core task where a calculation setting is critical to that task. For example, a static calculation should always have NSW=0
. We should provide defaults for settings that control calculation accuracy, but we should never forcibly override them.
What are some concrete examples?
NSW=0
and ISMEAR=-5
ISMEAR=0
and EDIFFG=-0.05
.ICHARG=11
, but should not override user-supplied kpoints (#278).I think we can also use this as a guiding principle for designing a calculation superclass, where one of the main common features of the superclass is to provide an inheritable set of arguments to control calculation accuracy.
@bocklund Isn't that what the InputSets are for? If we get the defaults out of Atomate, then as long you define an inputSet you get your settings propagated through.
Right now there's not a 1:1 map between input sets and calculations (see DFPTFW, workflows that use get_wf_deformations
). To me, a 1:1 map would be appropriate and fall under the design principle, but some of the stuff is hidden through custom modifications in atomate of MPStatic set and so on.
@bocklund I would modify a few things with regards to principles:
All atomate Fireworks have a 1:1 correspondence to a PMG input set, unless it is for a type of run that is not of interest to the general public. This means that Atomate should refrain from internal modifications to PMG parameters, unless it is explicit. E.g., a OptimizeFirework should not modify PMG parameters, but a OptimizeVeryAccurateForcesFirework can.
There should be no forcible anything. User wishes are paramount. The PMG Input Sets set reasonable defaults, but if a user wishes to use the MPStaticSet and set NSW to be non-zero, that is up to him. Maybe he has a good reason to. Certainly, we can implement warnings but the user is the ultimate arbiter.
User overrides should propogate to Atomate Fireworks. E.g., things like user_incar_settings, user_kpoints_settings, user_potcar_settings should be supported and passed through to the relevant PMG input set.
@bocklund I have a quick question. It is clear that ISMEAR = -5 is good for accurate energies in metals (no relaxation). What about forces (no relaxation)? Is ISMEAR=1 still preferred or is ISMEAR=-5 fine?
@shyuep In my experience, ISMEAR=-5 is not good enough for accurate forces.
IMO, if you don't know beforehand whether you have a metal or non-metal, you can't do one calculation for both accurate energy and accurate forces. You need to pick one or the other:
If I know a priori whether I have a metal or a not, I can specialize the forces to either use ISMEAR=-5 or ISMEAR=1 for non-metals/metals, respectively.
@bocklund What about non-metals that are metals in DFT? i.e., should have small gap but ends up having zero gap in DFT.
If DFT predicts occupied states at the Fermi energy, then the settings should reflect the behavior of that calculated electronic structure, even if we know that it's not metallic in reality
@bocklund Ok, thanks. I implemented a warning for obvious situations. https://github.com/materialsproject/pymatgen/commit/8c882110b66649dec029f674feb93a332b0baa86
Currently atomate has a lot of default values for parameters in the input sets that are already defined in pymatgen. It would be good to remove those from atomate so that default for any calculation type fully defined in pymatgen reside there. This makes understanding where settings are coming from a bit more transparent.