abelcarreras / DynaPhoPy

Phonon anharmonicity analysis from molecular dynamics
http://abelcarreras.github.io/DynaPhoPy/
MIT License

Still running into problems with NBLOCK #12

Closed gabkrenzer closed 3 years ago

gabkrenzer commented 3 years ago

Hi Abel,

I have been doing a few tests on DynaPhoPy with my data, but my results do not make sense, and I think NBLOCK might be the problem. To summarise the outcome of our last exchange: I re-wrote my XDATCAR sequentially, from 1 to 996 in steps of 1, even though my NBLOCK is 50, and we concluded that this file, used with a timestep_per_block = timestep*NBLOCK, should work.

First of all, when I run the Maxwell-Boltzmann distribution analysis it gives me a temperature fit of 6.6K, but a quick glance at my OSZICAR files shows that the temperature stays around 300K and certainly does not settle down at 6.6K. [image] I suspect this is due to my use of NBLOCK, since this calculation depends on the timestep and my timestep is the timestep per block, which is significantly larger. Because the atoms are likely to move periodically, there is probably not much change from structure to structure, which would reduce the fitted temperature of the system.

I therefore ran dynaphopy on a previous run with 32 atoms, a timestep of 1fs, 1,000 steps and NBLOCK=1, and this time I got a much more sensible temperature of 287.8K. [image]

What worries me more is that DynaPhoPy may therefore also be having trouble interpreting the velocities (since it calculates them from the coordinates of the trajectory), which are at the core of your model and of your technique for calculating the power spectra. So I wonder whether NBLOCK could basically be ruining the whole calculation.

What reinforces this idea is that the Maxwell-Boltzmann distribution analysis is not the only feature misbehaving. I have run a peak analysis of the phonon-projected power spectrum and I also get weird results. Here is a snapshot of the output for one mode and its corresponding plot. [image] [image]

Even without going into that much detail, the power spectrum projected onto the gamma point does not look good either: I expect peaks around the frequency of each of the 12 modes present, and I only get 3 here... [image]

I tried the peak analysis on my previous run and it looks somewhat better, at least for the fitting error, even though the frequency still looks wrong. [image] [image]

I suspect this is mainly because I only use 0.85ps worth of simulation and 32 atoms, which also means I use far fewer coefficients for the algorithm that computes the power spectra from the velocities, since there is a requirement that nb_of_step > nb_of_coeff (though I cannot use that many for the other simulations either). Still, it is quite striking that the fitting error is smaller for my 0.85ps run with 32 atoms than for my 100ps run with 256 atoms... So I think there might be an issue in how the code interprets NBLOCK, and maybe it simply cannot work with an NBLOCK different from 1. But I might be making another mistake.

Let me know your thoughts on this,

Gabriel

abelcarreras commented 3 years ago

Hi Gabriel,

The issue with NBLOCK is strange. It only comes into play when reading the trajectory: the time step you set with "-ts" is multiplied by NBLOCK (obtained by reading the block numbers in the XDATCAR). The rest of the program simply uses this timestep for all calculations. I just tested this by reading an XDATCAR file in which the block numbers were multiplied by 10 while setting a timestep 10 times smaller, and I got the same results as with the original XDATCAR and timestep.
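To make the behaviour explicit, this is roughly what happens when reading the trajectory (an illustrative sketch, not dynaphopy's actual code):

    import re

    def effective_timestep(xdatcar_path, ts):
        # Infer NBLOCK from the 'Direct configuration=' counters of an XDATCAR
        # and return the effective timestep (ts * NBLOCK) between stored frames.
        with open(xdatcar_path) as f:
            blocks = [int(m.group(1))
                      for m in re.finditer(r"Direct configuration=\s*(\d+)", f.read())]
        spacings = {b2 - b1 for b1, b2 in zip(blocks[:-1], blocks[1:])}
        if len(spacings) != 1:
            raise ValueError(f"Block numbers are not evenly spaced: {sorted(spacings)}")
        return ts * spacings.pop()

    # effective_timestep("XDATCAR", ts=0.002) -> 0.1 ps if the counters go 50, 100, 150, ...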

There could be several reasons for the unsuccessful fitting: 1) The order of atoms in the supercell is incorrect. Dynaphopy expects a particular atom order when constructing the supercell; this is crucial for knowing which atoms are equivalent to which by translational symmetry. If you used dynaphopy itself to create the supercell from the original unit cell you should be safe.

2) The phonon eigenvectors (obtained from the provided force constants) do not properly describe the vibrations of atoms within your MD simulation (The projection into phonon modes is wrong).

If the projection onto phonon modes is correct you should observe a single peak for each phonon mode. The global fitting error is only meaningful when a single peak is present.

The fit temperature calculation is not the same as the Boltzmann analysis. The Boltzmann analysis is computed directly from the atomic velocities, so it will "always work" independently of the model/anharmonicity/etc. The fit temperature is calculated from the area below the curve of the fitted phonon-mode-projected spectrum. For this reason, if the fitting is unsuccessful or the system is highly anharmonic (so that the model cannot be applied because the spectrum shape is far from a Lorentzian), the calculated temperature may be very different from the MD temperature.
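Just to illustrate the Boltzmann side (this is not dynaphopy's exact implementation), something along these lines fits a Maxwell-Boltzmann speed distribution to MD velocities to extract a temperature; speeds and mass_kg are placeholders you would fill in:

    import numpy as np
    from scipy.optimize import curve_fit
    from scipy.constants import k, atomic_mass

    def mb_speed_pdf(v, T, m):
        # Maxwell-Boltzmann speed distribution for atoms of mass m (kg) at temperature T (K)
        a = m / (2.0 * k * T)
        return 4.0 * np.pi * (a / np.pi) ** 1.5 * v ** 2 * np.exp(-a * v ** 2)

    def fit_temperature(speeds, mass_kg, bins=100):
        # speeds: 1D array of atomic speeds in m/s, flattened over all atoms and steps
        hist, edges = np.histogram(speeds, bins=bins, density=True)
        centers = 0.5 * (edges[:-1] + edges[1:])
        popt, _ = curve_fit(lambda v, T: mb_speed_pdf(v, T, mass_kg),
                            centers, hist, p0=[300.0])
        return popt[0]

    # e.g. fit_temperature(speeds, 6.94 * atomic_mass)  # lithium, hypothetical speeds array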

gabkrenzer commented 3 years ago

Hi Abel,

Sorry for the late reply, I have been very busy over the past few days.

Ok, so I guess the problem is not coming from NBLOCK then, especially since both simulations, NBLOCK = 1 and NBLOCK = 50, give several peaks in their power spectrum. This is even though, in my XDATCAR, the structures are written in steps of 1 instead of 50, i.e. Direct configuration = {1, 2, 3, ..., N} rather than Direct configuration = {50, 100, 150, ..., N}.

Following your suggestions,

  1. I visualised my trajectories more carefully; they look well-behaved and I recover one phonon mode identified from the eigenvectors obtained with phonopy, the imaginary mode - could the associated anharmonicity bring problems? Also, at higher T I should expect highly anharmonic behaviour, but so far I have only tried 300K, so it should be fine. Still, I am curious to know what the spectrum shape looks like for a highly anharmonic system.
  2. My supercells were generated using phonopy, so I guess that should be ok?
  3. That leaves the option that the phonon eigenvectors in my FORCE_SETS do not match the MD vibrations. I would be very surprised if that were the case, as I properly relaxed and converged my parameters, but it is possible. I used different parameters from a study I was reproducing and obtained a slightly different phonon dispersion, though not that different considering how different the parameters are. Still, I don't know whether that would be enough to create a mismatch between the FORCE_SETS and the MD. I have attached the two superimposed phonon dispersions. The parameters are: mine, E_cutoff = 950eV and k_mesh = 4x4x4 for the relaxation, then a 2x2x2 supercell for the phonons with corresponding parameters; theirs, E_cutoff = 520eV, k_mesh = 20x20x16, and a 3x3x3 supercell. Unfortunately, because theirs is a 3x3x3 supercell I cannot use it as a check, since my MD supercell is 4x4x4. But I will try other parameters to see if that might be the case. myDFTvDFTpaper.pdf
  4. Another point that might be worth mentioning: I tried to use my QHA data with qha_extract, but I ran into a problem. The issue does not seem linked to my FORCE_SETS being wrong; it looks like a different problem. The error message I obtain is
    File "/home/gkrenzer/.local/lib/python3.8/site-packages/dynaphopy-1.17.6-py3.8-linux-x86_64.egg/EGG-INFO/scripts/qha_extract", line 101, in <module>
    qha_temperatures = phonopy_qha._qha._temperatures[:phonopy_qha._qha._max_t_index]
    AttributeError: 'QHA' object has no attribute '_max_t_index'

    I am not 100% clear on what that line does, but it is part of the code where qha_extract calls phonopy_qha (https://github.com/abelcarreras/DynaPhoPy/blob/master/scripts/qha_extract#L86-L101). So I went to look into https://github.com/phonopy/phonopy/blob/develop/phonopy/qha/core.py and https://github.com/phonopy/phonopy/blob/develop/scripts/phonopy-qha, and in https://github.com/phonopy/phonopy/blob/develop/phonopy/qha/core.py#L122 they do use a _t_max property, but I could not find any _max_t_index in there... Could it be an issue with the qha_extract code? Or is it an issue with my files?

Gabriel

abelcarreras commented 3 years ago

Dear Gabriel,

If you used phonopy to generate the supercell then it should be fine, since dynaphopy uses the same atom order. One way to check this would be to plot the atomic displacement distribution using dynaphopy. If the order is correct you should see a nice normal distribution around the equilibrium positions (0 in these plots). Since this distribution includes all atoms equivalent by translational symmetry, if something is wrong with the expected position of some atom it will show up in the plot as multiple peaks.
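If you prefer to check this outside dynaphopy, the same idea looks roughly like the following (an illustrative script, not the dynaphopy implementation; positions, equilibrium and lattice are placeholders for your own arrays):

    import numpy as np
    import matplotlib.pyplot as plt

    def displacement_histogram(positions, equilibrium, lattice, bins=200):
        # positions:   (n_steps, n_atoms, 3) fractional coordinates from the MD
        # equilibrium: (n_atoms, 3) fractional coordinates of the reference supercell
        # lattice:     (3, 3) supercell lattice vectors in Angstrom
        diff = positions - equilibrium[None, :, :]
        diff -= np.rint(diff)                       # minimum-image convention
        disp = diff @ lattice                       # Cartesian displacements
        plt.hist(disp[..., 0].ravel(), bins=bins, density=True)
        plt.xlabel("displacement along x (Angstrom)")
        plt.ylabel("probability density")
        plt.show()

With the correct atom order this gives a single Gaussian centred at zero; a wrong order shows up as extra peaks.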

Point 3 could be a problem. If you used a different unit cell for the MD and for the phonon force-constant calculation, the eigenvectors can be misaligned. This can cause the phonon mode projections to contain contributions from other modes, showing multiple peaks in the power spectrum. This can be especially serious if your system is complicated: even if the frequencies are similar, the eigenvectors can change a lot.

I haven't used qha_extract for a long time, it is perfectly possible that it is no longer compatible with the latest version of phonopy. I will check it.

gabkrenzer commented 3 years ago

Dear Abel,

Thank you for the tip to check the supercell ordering. I will sanity check that and will let you know.

I have used the same unit cell for my DFT phonon calculations and for the aiMD. By the way, my system is really not complex at all: 4 atoms in the primitive cell. I was just highlighting that different parameters for the relaxation and phonon calculations - from a paper - that also seem reasonably converged gave a different phonon dispersion, and I wondered whether such a difference could explain why the eigenvectors and the MD vibrations do not match, potentially indicating that my phonon calculations are not accurate enough and hence do not match the MD vibrations. Potentially, I could use a larger supercell for my DFT phonons and see if it changes something, e.g. a 4x4x4 supercell instead of 2x2x2. The only thing is that I used quite a low mesh sampling for my relaxation, 4x4x4, which would mean that for consistency I'd use a 1x1x1 sampling for the 4x4x4 supercell phonon calculations. But that could still be something to try, as I would still use the same unit cell and wouldn't have to redo the MD calculations, which are the most time-consuming ones. I also wanted to check with the FORCE_SETS obtained with the paper's parameters, but since the unit cells are different that won't make for a good comparison, plus the supercells are not multiples of each other, so that won't work either. Otherwise, if nothing works there, I guess that means I potentially need to redo my phonons with higher-accuracy parameters and redo my MD with the corresponding unit cell?

Is it common for dynaphopy users to have phonon eigenvectors not matching MD vibrations?

Another thing I was thinking of trying is the dynaphopy harmonic eigenvectors feature. Does it represent the eigenvectors for the MD vibrations? If so, I could compare it to the eigenvectors obtained from my phonopy calculations and see whether they really don't match. If the dynaphopy feature represents the eigenvectors from the DFT calculations, then it won't be of any use.

Thank you for having a look at qha_extract and taking the time to look at this issue.

Gabriel

abelcarreras commented 3 years ago

Dear Gabriel,

From my experience, the best way to work is to use the same supercell size for the phonon calculation and the MD. So if you are using a 2x2x2 supercell for the MD, use the same one to calculate the harmonic phonon force constants. Also use the same methodology for both calculations for best results.

Usually the method implemented in dynaphopy works pretty well, especially in high-symmetry (non-complex) systems; most issues come from the order of the atoms. Unfortunately I haven't found a reliable algorithm to automatically detect the order and reorder the atoms accordingly. As I suggested, calculating the atomic displacements with dynaphopy is a good way to check whether the atom order is correct.

gabkrenzer commented 3 years ago

Dear Abel,

Ok. I am using a 2x2x2 supercell for my phonon calculations and a 4x4x4 supercell for my MD simulation. I am currently running phonon calculations with a 4x4x4 supercell. Hopefully the new FORCE_SETS will work.

I have calculated the atomic displacements with dynaphopy and I get well-behaved Gaussians at 300K and 400K. At 678K, however, I get multiple peaks. Is that common at higher temperature? In your paper you go as high as 1400K, though. Nevertheless, I expect a superionic transition there and anharmonicity should kick in massively at this temperature, so it is potentially not a bad sign. But does that mean that at 678K dynaphopy will struggle with other calculations? Anyway, at 300K and 400K it should work, but I ran another check and the phonon-mode-projected power spectra are completely wrong. So it looks like the only thing left is a problem with my FORCE_SETS? We will see how the 4x4x4 supercell does.

Gabriel

abelcarreras commented 3 years ago

Dear Gabriel,

If you get multiple peaks at high temperature while the distribution is well behaved at low temperature, maybe it just means that at such a high temperature your crystal structure is destroyed. Have you checked the MD using some kind of molecular visualization software? If the crystal is almost melted and the atoms just move around the whole supercell, the anharmonic approach used in dynaphopy will not apply.

abelcarreras commented 3 years ago

Also, if you have a phase transition, so that the equilibrium positions of the atoms change with respect to the original crystal structure, you will also obtain multiple peaks. In that case you will need to use this high-temperature crystal structure as the dynaphopy input unit cell.

gabkrenzer commented 3 years ago

Dear Abel,

Sorry for the late reply, I had a week-long training course last week.

I have had trouble converging my 4x4x4 phonon calculations, but I will keep you updated once I have generated the force constants from them and used the FORCE_SETS with my MD data.

Concerning the atomic displacement distribution at higher temperatures, surprisingly the inter-atomic distance is still respected at about 2.085 Å, which is the Li-Li distance of my equilibrium structure. Also, it looks like each individual peak follows a Gaussian distribution. Since the integral over all peaks is constant, I think it just shows that the original site is depopulated but the crystalline order remains, which is to be expected at higher temperature from transition state theory. I have visualised my trajectory too; the structure looks ordered but some ions perform jumps from time to time, so that would confirm it. [image]

Gabriel

abelcarreras commented 3 years ago

I agree, this is probably the case. Unfortunately, in this situation dynaphopy is not able to use the crystal symmetry to calculate the proper average power spectrum. However, you may still be able to get some results by defining the unit cell and primitive cell as your supercell. You will get lots of power spectra with a lot of statistical noise, but it's something. You could even try to average them by hand.

gabkrenzer commented 3 years ago

Hi Abel,

First of all, sorry for the late reply, I was waiting to get my phonon calculations to converge with the 4x4x4 supercell before replying. I have just tried the corresponding FORCE_SETS in Dynaphopy, but it still does not look great, even at temperatures where the crystal structure is still well-behaved and I get a Gaussian for the atomic displacements (300K).

That's the power spectrum I get when projecting onto the Gamma point. The peak analysis gives horrendous results too. At this point, I am not sure what to try. [image]

Gabriel

abelcarreras commented 3 years ago

That is very weird. What about the full power spectrum?

gabkrenzer commented 3 years ago

The full power spectrum looks like this. It does not match the DoS even remotely and really just looks like noise. Even weirder, it carries on past 21THz, which is roughly the frequency of the highest-energy optical mode in the Brillouin zone. [image]

Gabriel

abelcarreras commented 3 years ago

This I don't understand. Basically, the full power spectrum is just the Fourier transform of the velocity autocorrelation function; it does not depend on the force constants or the atom order. According to the velocity distribution you previously posted this was OK, right? That means the velocity is calculated properly. Have you tried changing the power spectrum calculation method? Have you skipped the first time steps of the MD simulation to ensure it is properly equilibrated?
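Conceptually it boils down to something like this (a simplified sketch of a velocity-based power spectrum, not the actual dynaphopy code):

    import numpy as np

    def power_spectrum_from_velocities(velocities, dt):
        # velocities: (n_steps, n_atoms, 3) array; dt: effective timestep between stored frames.
        # By the Wiener-Khinchin theorem, |FFT(v)|^2 is (up to normalization) the
        # Fourier transform of the velocity autocorrelation function.
        n_steps = velocities.shape[0]
        vw = np.fft.rfft(velocities, axis=0)
        spectrum = np.sum(np.abs(vw) ** 2, axis=(1, 2)) * dt / n_steps
        freqs = np.fft.rfftfreq(n_steps, d=dt)   # in 1/(units of dt), e.g. THz for dt in ps
        return freqs, spectrum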

gabkrenzer commented 3 years ago

Hi Abel,

The thing is, my velocity distribution did not look right in my first message. If you look back, you can see the velocity distribution seemed to depend on NBLOCK. For 300K aiMD runs I was getting 287.8K with NBLOCK=1 and 6.6K with NBLOCK=50.

Re-calculating the velocity distribution, I got something different from last time and realised I had used the MD time step - 2fs - instead of the time step per NBLOCK - 100fs - for the last full power spectrum I sent you. I re-calculated the full power spectrum and got this, which is surprisingly periodic. [image] So I tried my short run with NBLOCK = 1 and got this using the MEM method, which is not good but at least stops around the right frequency. Then again, it is a much shorter run, on a smaller supercell, and with fewer coefficients for MEM, so could that explain why it is not great? [image]

As for the other methods, the Fourier Transform one does not work and I get Segmentation fault (core dumped). FFTs, however, work, and I get this, which is still bad. [FFT_numpyNBLOCK50] When I use FFTs with my short run with NBLOCK=1, however, I get this, which is not great but at least sort of follows the shape of the DOS, and you can imagine that with more data you would eventually get there. [image]

So I can't help thinking that there is a problem with NBLOCK. I cannot otherwise explain why I get better results with a 2x2x2 supercell over 1ps than with a 4x4x4 supercell over 100ps... and it looks like the problem comes from the velocity distribution in the first place.

Let me know your thoughts, Gabriel

abelcarreras commented 3 years ago

Hi Gabriel,

At least this makes more sense. In principle you should use the time step you used for the simulation (not the time step per block); dynaphopy automatically multiplies this timestep by NBLOCK to get the timestep per block. That's why the frequency scale is messed up. Still, I'm not saying there are no bugs in the NBLOCK treatment. To test what is going wrong, can you check different timesteps to find which one gives you the correct frequencies? You can check it with the velocity distribution or the full power spectrum. It should be some integer multiple of the timestep. This can help me guess where the problem is.

The periodicity in the first PS is not surprising: the FFT is periodic and only approximates the "real" Fourier transform within the Nyquist interval. This just indicates that there is some problem with the time scale. For the calculation of the PS use FFT; you may have too few steps for MEM, and the direct FT method was just for testing in early development and may be broken now.

gabkrenzer commented 3 years ago

Hi Abel,

Thank you for your reply, it is really insightful.

  1. Originally we thought it would be good to set -ts to the timestep per NBLOCK, since when I concatenated my XDATCARs they were written in steps of 1 instead of steps of 50 - I couldn't find a way of writing steps of 50 using pymatgen.
  2. Yes, since I have used NBLOCK=50, I only have 996 steps instead of the 50,000 steps I would have had with NBLOCK=1, so I usually set the number of coefficients to 980, since it has to be less than the number of steps.

Now, I thought I could look back at the guess dynaphopy gave when I fed it my old XDATCAR, which was concatenated in steps of 50 but in a non-sequential way - something like 50, 100, ..., 8300, 50, 100, 8450... etc - since I did several runs due to walltime. When I set -ts 0.002 on that file, so my MD time step, dynaphopy gave me a time step of 0.01778893, which turns out to give a velocity distribution that is not bad, since I get a temperature of 207.9K, even though I would expect something closer to 300K. With that I get this power spectrum using FFTs: [image] I also tried with MEM to see if the periodic behaviour was gone - potentially suggesting the time scale was right? - but it now goes back to something like this... [image]

I get the same thing if I set -ts 0.01778893 on my sequential file - so 1, 2, 3, ..., 996. So I tried setting -ts 0.015, which gave me 292.4K, which looks more like it, but 0.015ps is not a multiple of 0.002ps, so I tried 0.014ps and got 335.7K. Then I computed a PS with 0.014ps and got this: [image]

So both PS computed using FFTs look roughly the same, and they are still not doing great...

I also tried setting -ts 0.002 on my sequential XDATCAR, just to try, and this time I get 16448.9K, so the frequencies of the PS are completely wrong. But in the interactive mode the time step is still 0.002ps, so I guess that, since the XDATCAR file is written in steps of 1, dynaphopy does not realise that NBLOCK=50.

Perhaps I should look into writing an XDATCAR in steps of 50, rather than 1, in a sequential way? Would that change anything?

Gabriel

abelcarreras commented 3 years ago

Dynaphopy relies only on the information written in the XDATCAR to get the NBLOCK; if this is messed up then dynaphopy will not understand it properly. This means it needs the block information of all steps in the XDATCAR to be sequential and properly spaced (all of them) [50, 100, 150, 200, ...etc]. This is the information it uses to modify the timestep provided by -ts (in this case, multiply it by 50). In the end, the information you get in the interactive mode is an average over all timesteps, which in principle should be a multiple of the provided timestep if all steps are equally spaced and sequential. If it is not, then there may be some problem in the XDATCAR.
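If you want to repair the counters of a concatenated XDATCAR by hand, a minimal sketch could look like the following (the file names are placeholders and you should double-check the result before using it):

    import re

    def renumber_blocks(xdatcar_in, xdatcar_out, nblock=50):
        # Rewrite the 'Direct configuration=' counters of a concatenated XDATCAR
        # so that they become sequential and spaced by nblock (50, 100, 150, ...).
        counter = 0

        def bump(match):
            nonlocal counter
            counter += nblock
            return f"Direct configuration= {counter:5d}"

        with open(xdatcar_in) as fin, open(xdatcar_out, "w") as fout:
            for line in fin:
                fout.write(re.sub(r"Direct configuration=\s*\d+", bump, line))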

abelcarreras commented 3 years ago

By the way, have you tried the concath5 utility in the dynaphopy scripts folder?

https://abelcarreras.github.io/DynaPhoPy/special.html

this may be useful if you are interested in merging several MD trajectories.

gabkrenzer commented 3 years ago

I see. I will look into concath5 to concatenate my files nicely and will get back to you with what I get, sounds good. I also did not know you could run several independent MD simulations at the same time in VASP and then use them to get a more complete trajectory, instead of restarting at the end of each run - which is what I did - that would save so much time. But I guess if you then want to analyse things like diffusion mechanisms, you are limited?

gabkrenzer commented 3 years ago

Also, since I restart each run from the last velocities, only the first run is equilibrating and the others aren't. I see you can define the number of time steps to be skipped for each run with the flag -f, but is it possible to do that only for the first run? One option could be to manually take out some steps from the first run and then set -f to zero.

abelcarreras commented 3 years ago

Yes, it is perfectly possible to concatenate several independent MD trajectories to obtain a better PS. The only issue is the calculation of the velocity (from positions) at the border between trajectories, but this is a very small effect since the trajectories are much longer. Also, you can use the dynaphopy option --read_from to generate the first hdf5 file and concatenate it with the others using, for example, -f zero.

gabkrenzer commented 3 years ago

Hi Abel, so I have concatenated my trajectory using concath5, but I still get exactly the same results as when I used my full-trajectory XDATCAR in steps of 1 and specified the time step as the time step per block. So that was not the issue. It looks like, as long as you write them in sequential order, either you have the right spacing and dynaphopy finds the time step itself, or you have the wrong spacing and you tell it the time step per block yourself. It works the same either way.

The only thing that changes now is that when I saved my XDATCAR files as hdf5 files I specified the MD time step, so now when I run dynaphopy I don't need to say anything about the time step; it computes it correctly and displays the time step per block in the interactive window. That suggests the trajectory files are now all nicely written, but it does not solve the problem. So the problem is elsewhere.

Something I realised that may be of interest is that my last trajectory file contained only one block, corresponding to 40 steps, because I wanted to get exactly 50,000 ionic steps in total. But when I tried to save it as an hdf5 file I got the error: ValueError: Shape of array too small to calculate a numerical gradient, at least (edge_order + 1) elements are required. Would that mean it only uses the value in the block and does not consider that it actually corresponds to 40 steps? Or maybe it does, but only once it has more than one block available to compute the velocities.
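For what it's worth, the error message looks like it comes from numpy's gradient routine, which needs at least two frames along the time axis, so my guess (just a guess, not dynaphopy's actual code) at a minimal reproduction is:

    import numpy as np

    dt_eff = 0.1                          # effective timestep (MD timestep * NBLOCK), ps
    positions = np.zeros((1, 32, 3))      # a trajectory with a single stored block

    try:
        velocities = np.gradient(positions, dt_eff, axis=0)
    except ValueError as exc:
        print(exc)   # "Shape of array too small to calculate a numerical gradient, ..."

    # With two or more stored blocks the finite-difference gradient is well defined:
    positions = np.zeros((40, 32, 3))
    velocities = np.gradient(positions, dt_eff, axis=0)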

gabkrenzer commented 3 years ago

So, for instance, for my 300K trajectory I still get 6.8K as the temperature fit for the M-B analysis, whereas this is what I get when I plot the evolution of the temperature from the OSZICARs. [thermal_oscillations_300]

abelcarreras commented 3 years ago

If I understood correctly, I think the problem is that you are using a trajectory with just one step (one block, if you want), so it is impossible for dynaphopy to compute the velocities from it. When you store the hdf5 file, dynaphopy actually does all the work of calculating the velocities from the trajectory you provided; that's why it fails. When combining several trajectories, all of them should be long enough that the errors in the velocity calculation at the limits (first step and last step) are negligible.

abelcarreras commented 3 years ago

Also, you cannot combine trajectories with different time steps (or different NBLOCK).

abelcarreras commented 3 years ago

Also, if the problem is in the time step (so that you get a good shape in the MB distribution), you can try to change it by hand until you get an MB distribution that fits to 300 K; this will give us some insight into what is going wrong.

gabkrenzer commented 3 years ago

Ok, that makes sense. But all of my other trajectories - there are six others - are consistent - same time step, same NBLOCK - and are each about 8000 steps, so about 160 blocks each. So maybe 160 is not enough? But the statistics in those 160 blocks correspond to 8000 steps, so you should get the same result as for 8000 single steps, right? But looking at how much better it is with 1000 ionic steps and NBLOCK=1, maybe you don't?

gabkrenzer commented 3 years ago

Whatever -ts I set for the hdf5 files, it always returns 0.1ps in the interactive window, so it looks like it is set when writing the files and I cannot change it.

As I mentioned earlier, setting a timestep of 0.015ps for the XDATCAR written in steps of 1 - so 0.015ps would be the time step per block - gives a temperature fit of 292.5K, but then the power spectrum still does not look convincing - see the PS above. So I suspect the problem does not lie there, or at least not entirely.

abelcarreras commented 3 years ago

Unfortunately not; the statistics of 160 blocks are not equivalent to 8000 steps, just to 160 points. The sampling may be better, but the number of points for statistics is just 160, especially when we are transforming them to velocities to get time information. Fewer samples (fewer points) in your MD will lead to less resolution in your Fourier transform and therefore less resolution in your power spectrum.

gabkrenzer commented 3 years ago

So could that be the reason why I get much worse results with a larger NBLOCK? Would you then recommend using NBLOCK=1 in the future when using dynaphopy?

But coming back to our problem: on the total trajectory I get roughly 1000 points with NBLOCK=50, so I should at least get the same thing as with 800 points and NBLOCK=1, no? And even then, I don't see why I get a temperature fit of roughly 6K. Even if the results are less accurate than I expected, I should still get roughly 300K for the temperature fit and something comparable to my data with 800 points and NBLOCK=1, if not better.

abelcarreras commented 3 years ago

Maybe the problem is in the calculation of the velocities: if the atoms in subsequent blocks are too far apart, the precision of the velocity will be worse. Take into account that the velocities are calculated from the positions. Using NBLOCK or not is up to you, but unless your timestep is very small I would recommend using a small NBLOCK. The "real" timestep for dynaphopy is timestep*NBLOCK; you should try to keep this value reasonable to obtain good velocities. It would be another matter if you provided the velocities directly (which can only be done with LAMMPS right now); then this may not be so important.

gabkrenzer commented 3 years ago

I see, yes, that's what I was thinking. So I guess NBLOCK=50 with a time step of 2fs was probably a bit too much for dynaphopy. So would you just recommend redoing the MD simulations then?

Yes, I can imagine that if you provide the velocities then you would probably get away with it.

gabkrenzer commented 3 years ago

Because I guess dynaphopy cannot make much of that data?

abelcarreras commented 3 years ago

Yeah, it is probably better to redo the MD with a smaller NBLOCK. In the end, the more data points you have, the better the PS you get. So, if you have no issues with storage space, maybe it is better to just write out all steps. Keep in mind that the range of frequencies you can analyze will depend on that "real" timestep.
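To put numbers on that last point (a back-of-the-envelope check with the values from this thread, not dynaphopy output):

    # 2 fs MD timestep with NBLOCK = 50 gives an effective 0.1 ps between stored frames
    md_timestep = 0.002               # ps
    nblock = 50
    dt_eff = md_timestep * nblock     # 0.1 ps

    f_nyquist = 1.0 / (2.0 * dt_eff)  # = 5 THz: frequencies above this cannot be resolved
    print(f_nyquist)

    # The highest optical mode mentioned above sits around 21 THz, well beyond 5 THz,
    # whereas with NBLOCK = 1 (dt_eff = 0.002 ps) the Nyquist limit is 250 THz.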

gabkrenzer commented 3 years ago

Ok, I will do that then. I can always save them as hdf5 files later on. Thank you for the help!

abelcarreras commented 3 years ago

Thanks for your feedback!

gabkrenzer commented 3 years ago

Hi Abel,

I was discussing the outcome of our exchange with Dr. Ben Morgan, and he told me he wrote a VASP patch that writes out the actual velocities from VASP, so you don't need to estimate them from positions - you told me earlier you did not have such an option for VASP, only for LAMMPS. It might be useful to you and enable future users to use a larger NBLOCK: https://gist.github.com/bjmorgan/1f049a0ebf9aeb9058c8d672f235e6e5

Gabriel

abelcarreras commented 3 years ago

Thanks Gabriel,

I will take a look at it

gabkrenzer commented 3 years ago

Hi Abel,

Sorry to bother you again. I am currently setting up my calculations to run several independent MD simulations at the same time, but the VASP documentation is not exactly clear on what to do and none of the members of my group have tried this before. Do you simply set up normal MD simulations and choose different RANDOM_SEED values so that the random number generator is different and they start with different velocities, etc.? Or do you also need to set up biased MD, constrained MD, or metadynamics to make sure each trajectory is directed in some way that it is definitely different from all the others? But I guess that would then give you biased statistics, so it would probably not be that good.

Thanks, Gabriel

abelcarreras commented 3 years ago

Hi Gabriel,

Yes, you set a different random seed for each calculation and you are good to go. The point here is to ensure you get a different trajectory for each calculation. If all trajectories are the same you won't get any new information and the PS will be the same as using just one trajectory. You can get the same effect by changing the initial velocities or slightly modifying the initial positions of the atoms.
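For example, something along these lines would prepare several runs that differ only in the seed (a rough sketch; the directory layout is made up and the RANDOM_SEED handling may differ between VASP versions, so check the documentation for yours):

    import os
    import shutil

    base_dir = "md_300K"            # hypothetical directory with INCAR, POSCAR, KPOINTS, POTCAR
    seeds = [11111, 22222, 33333, 44444]

    for i, seed in enumerate(seeds, start=1):
        run_dir = f"run_{i:02d}"
        shutil.copytree(base_dir, run_dir)          # one copy of the inputs per independent run
        with open(os.path.join(run_dir, "INCAR"), "a") as incar:
            incar.write(f"\nRANDOM_SEED = {seed}\n")  # only the seed differs between runs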

gabkrenzer commented 3 years ago

Great, thank you Abel!

gabkrenzer commented 3 years ago

Hi Abel,

Sorry to bother you again. I have run my MD simulations with different random seeds and NBLOCK = 1 this time, but I still do not get very convincing results for my power spectrum...

This one was done with the maximum entropy method using 2000 coefficients: [full_spectrum_300K_entropy]

This one was done with FFTs and I have included the negative region, which surprisingly shows a large peak. Even though there is an imaginary mode in the phonon dispersion - you can see it in the DOS - that is very active when I visualise my simulations, I would not have expected such a peak. To be honest, it looks wrong altogether. The good thing, though, is that the entropy and FFT methods are consistent with each other - the two positive parts are the same - so would that say something about the MD data, i.e. the system at hand? [full_spectrum_300K_FFT]

For this last one I wanted to include negative frequencies using the entropy method, but I just get a line: [full_spectrum_300K_entropy_weird]

Gabriel

abelcarreras commented 3 years ago

Hi Gabriel,

The point is that the negative-frequency region in the power spectrum is probably not what you think. Those are not imaginary frequencies. I can't remember the mathematical details rigorously, but they have to do with the two signs of the frequency in the Fourier transformation. These frequencies therefore have to be taken as absolute values and can be added to the positive region. If the signal is real (such as the velocities in this case) the two regions (positive and negative) are symmetric, as you observe in the FFT. The maximum entropy method is not defined for the negative region. In dynaphopy I sum the two regions together in the FFT to get the proper integration (full power) of the power spectrum in the positive region, which corresponds to the total kinetic energy. In summary: just ignore the negative region, it contains no extra useful information.
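A quick numpy check of that symmetry statement (just an illustration, independent of dynaphopy):

    import numpy as np

    # For a real signal the power at negative frequencies mirrors the positive-frequency
    # part, so the negative region carries no extra information.
    rng = np.random.default_rng(0)
    signal = rng.normal(size=1025)            # any real-valued signal (odd length avoids the unpaired Nyquist bin)

    spectrum = np.abs(np.fft.fft(signal)) ** 2
    freqs = np.fft.fftfreq(signal.size)

    positive = spectrum[freqs > 0]
    negative = spectrum[freqs < 0][::-1]      # reorder so -f lines up with +f
    print(np.allclose(positive, negative))    # True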

gabkrenzer commented 3 years ago

Hi Abel,

I see! Thanks a lot for that Abel, that is really helpful! So do you think my power spectra look good overall? When I look at what you obtain for Si, I am not even remotely as close to reproducing my harmonic DOS.

abelcarreras commented 3 years ago

Well, silicon has a lot of symmetry, so it is expected to converge faster. The range of frequencies you obtain from MD is more or less the same as in the harmonic DOS, which looks good. Now you should check the peak analysis or compare the phonon band structure with the harmonic one. If you are also interested in the linewidths, these will take even longer MD trajectories to converge. There is no way to know exactly how long the MD should be; it is a matter of checking different numbers of steps and seeing when the property you are interested in converges.

gabkrenzer commented 3 years ago

I see, thanks a lot Abel, that's so helpful! Thank you for taking the time!

gabkrenzer commented 3 years ago

Hi Abel,

So I compared the renormalised phonon dispersion with my harmonic dispersion. It did not work well when I used the FORCE_SETS obtained with a 4x4x4 supercell, but I think that was just because dynaphopy was trying to get the values at 64 k-points, which was a bit too much for the accuracy of my simulation, and it got the interpolation wrong. [renormalized_phonons4x4x4]

It worked well when using the FORCE_SETS obtained with a 2x2x2 supercell - much better, actually - and I feel confident in the results for the frequency shifts. [renormalized_phonons2x2x2]

So I tried to do that using my QHA data, but again I run into the same problem as before when I try to use qha_extract. Going back to what I mentioned last time:

The error is:

File "/home/gkrenzer/.local/lib/python3.8/site-packages/dynaphopy-1.17.6-py3.8-linux-x86_64.egg/EGG-INFO/scripts/qha_extract", line 101, in <module>
qha_temperatures = phonopy_qha._qha._temperatures[:phonopy_qha._qha._max_t_index]
AttributeError: 'QHA' object has no attribute '_max_t_index'

I am not 100% clear on what that line does, but it is part of the code where qha_extract calls phonopy_qha (https://github.com/abelcarreras/DynaPhoPy/blob/master/scripts/qha_extract#L86-L101). So I went to look into https://github.com/phonopy/phonopy/blob/develop/phonopy/qha/core.py and https://github.com/phonopy/phonopy/blob/develop/scripts/phonopy-qha, and in https://github.com/phonopy/phonopy/blob/develop/phonopy/qha/core.py#L122 they do use a _t_max property, but I could not find any _max_t_index in there... Could it be an issue with the qha_extract code? Or is it an issue with my files?

Regarding linewidth convergence, I have done a MEM coefficient analysis, but I couldn't get it to go higher than 1000 coefficients. All modes - apart from 2 - looked converged or close to converging, so I tried a peak analysis with 2000 coefficients. I got single peaks for all modes, so I was very happy - apart from the acoustic modes, where I get 2 peaks and non-zero frequencies at the gamma point, around 14THz, which is a bit weird, especially because I get a great match when looking at the phonon dispersion. The phonon dispersion with linewidths looks sensible too. Following your advice, it might still be wise to carry out further convergence tests for the linewidths with respect to MD run time. I would like to automate that in Python, so I wondered whether a dynaphopy library is available too? Or at least, is it possible to save the peak analysis to a file? It looks like you can only do that with -sp for one phonon mode. That would be very useful for extracting linewidth values at different temperatures and plotting mode linewidth versus temperature for a potential publication - very much like what you did in the dynaphopy paper.

Best Wishes, Gabriel

abelcarreras commented 3 years ago

Hi Gabriel,

Your phonon band structure looks nice. There is a dynaphopy python API that you can use. This script contains a simple example:

https://github.com/abelcarreras/DynaPhoPy/blob/master/examples/api_scripts/script_silicon.py

In this example it just calls high-level functions to plot data, but if you check the code of these functions I think you can figure out how to get the raw data (if not, let me know).

About QHA: yeah, sorry about that, I didn't have time to fix it, but apparently it is no longer compatible with the latest version of phonopy. I will try to find some time next week to fix this script.

abelcarreras commented 3 years ago

Also, you can check the test scripts in the unittest folder.