arohl / gdis

A visualization program for the display, manipulation, and analysis of isolated molecules and periodic structures
GNU General Public License v2.0

Track #23

Status: closed by ovhpa 5 years ago

ovhpa commented 5 years ago

Dear Prof. Rohl,

I have finally completed GDIS's tracking feature. It allows launching USPEX or VASP calculations and tracking their progress in real time. I have also modified the task system so that several calculations can be launched at the same time. The tracking is a little different for VASP and USPEX, though.

For example, here are two USPEX calculations running simultaneously. The main graph, ALL, shows all structures at once, plotted by total energy per atom. Structures that belong to the BESTIndividuals file, i.e. those that USPEX considers the best, are shown in red. Structures from the generation currently being calculated are shown in green.

[screenshot: track010]

The second graph, BEST, shows the best structures (also taken from the BESTIndividuals file).

[screenshot: track011]

I keep the BEST graph because it is sometimes tricky to select a structure in the main graph, which can be very dense. Next is the convex hull graph, which has several new features:

[screenshot: track012]

As in the main graph, green squares are the structures from the generation currently being calculated. This time, however, the structures are plotted against a "formation energy". Given a compound AxBy, the energy is:

Ef = E[AxBy] - ( x/n E[An] + y/m E[Bm] )

where An and Bm are the minimum-energy structures containing only A and only B, respectively, taken from the USPEX calculation results. It is also possible to use external references by providing a chem.in file. Its syntax is straightforward; for example, it can contain the line:

1 2 -16.625834

in which case the reference for species 1 (the first species in USPEX AtomType order) has 2 atoms and a total energy of -16.625834 eV. If species 1 is A, this means the reference is [A2]. Each species is defined on a single line, and any missing species reference is taken from the USPEX calculation. Additionally, if no external reference is provided, the graph is first calculated using total energy per atom, until at least one structure of each single species has been calculated.
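As an illustration only (this is not the GDIS code, which is written in C), the chem.in lookup and the formation-energy formula above could be sketched in Python roughly as follows. The function names, the dictionary layout, the handling of malformed lines, and the second reference line in the usage example are all assumptions made for this sketch.

```python
def parse_chem_in(text):
    """Parse lines of the form 'species n_atoms total_energy' into a dict.

    Returns {species_index: (n_atoms, total_energy_eV)}.
    Skipping lines that do not have three fields is an assumption of this sketch.
    """
    refs = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        species, n_atoms, energy = int(fields[0]), int(fields[1]), float(fields[2])
        refs[species] = (n_atoms, energy)
    return refs


def formation_energy(total_energy, composition, refs):
    """Ef = E[AxBy] - (x/n E[An] + y/m E[Bm]), for any number of species.

    composition maps species index -> atom count in the compound (x, y, ...).
    refs maps species index -> (atoms in the reference, reference total energy).
    """
    reference_sum = 0.0
    for species, count in composition.items():
        n_atoms, ref_energy = refs[species]
        reference_sum += count / n_atoms * ref_energy
    return total_energy - reference_sum


if __name__ == "__main__":
    # First line is the example from the text; the second line is made up.
    refs = parse_chem_in("1 2 -16.625834\n2 1 -3.0\n")
    # Hypothetical compound A2B3 with a made-up total energy of -30.0 eV.
    print(formation_energy(-30.0, {1: 2, 2: 3}, refs))
```

References missing from chem.in would, following the description above, be filled in from the single-species structures found in the USPEX results before calling `formation_energy`.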

As before, each structure can be selected in the graph:

[screenshot: track110]

The corresponding structure can then be used directly. For example, I decided to use the structure for a VASP calculation:

[screenshot: track114]

Note that VASP tracking is different. Every SCF energy is presented in a single graph, in which the diamond symbols correspond to the ionic steps, each linked to the corresponding structure.

Each task is registered with the GDIS task manager, but I modified the code slightly so that each process has its own PID and can be killed independently:

[screenshot: track112]

This was not possible before because tasks were spawned synchronously. This is one reason why I now spawn them asynchronously (and use the waitpid system function). If this proves stable enough, I think it would be nice to change all the calculation-related tasks to use this strategy.
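To make the asynchronous idea concrete, here is a minimal Unix-only sketch in Python of spawning several processes without blocking, polling them with a non-blocking waitpid, and killing one by its PID. It is not the GDIS implementation (which is C); the helper names `spawn`, `poll`, and `wait_all`, the polling interval, and the use of the `sleep` command as a stand-in for a calculation are assumptions.

```python
import os
import signal
import time


def spawn(argv):
    """Start argv without waiting for it; return the child's PID."""
    return os.spawnvp(os.P_NOWAIT, argv[0], argv)


def poll(pid):
    """Return None while the child is running, else a description of how it ended."""
    done_pid, status = os.waitpid(pid, os.WNOHANG)  # returns (0, 0) if still running
    if done_pid == 0:
        return None
    if os.WIFSIGNALED(status):
        return "killed by signal %d" % os.WTERMSIG(status)
    return "exited with code %d" % os.WEXITSTATUS(status)


def wait_all(pids, interval=0.05):
    """Poll every PID until all children have finished; return {pid: outcome}."""
    results = {}
    pending = set(pids)
    while pending:
        for pid in list(pending):
            outcome = poll(pid)
            if outcome is not None:
                results[pid] = outcome
                pending.discard(pid)
        if pending:
            time.sleep(interval)
    return results


if __name__ == "__main__":
    long_job = spawn(["sleep", "30"])   # stand-in for a long calculation
    short_job = spawn(["sleep", "0"])   # stand-in for a short one
    os.kill(long_job, signal.SIGTERM)   # kill one task without touching the other
    print(wait_all([long_job, short_job]))
```

Because each child has its own PID and the parent never blocks on a single one, a task manager built this way can stop one calculation while the others keep running.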

Anyway, I think the code is quite stable now, almost ready for the next USPEX workshop - I will do a round of bug hunting before giving the green light ;)

For the future GDIS version, I am very sorry that I have encountered some delays m( )m I will explain the reasons in a separate mail, but the work is still ongoing!

Sincerely,

ovhpa commented 5 years ago

I forgot to mention: regarding VASP tracking, when the vasprun.xml file is overwritten by a new calculation, the tracking does not stop, which makes it possible to track several VASP calculations performed in the same directory. For example, in a USPEX calculation using VASP optimization, I was able to follow all the VASP optimizations from a single GDIS tracking window for more than a week.

arohl commented 5 years ago

Looks great!

I look forward to you committing the changes. Please do change everything to asynchronous if you have the time.

Andrew


ovhpa commented 5 years ago

I made all tasks asynchronous. Additionally, I changed the display of the task status a little. The "status file" is now updated live, by seeking through it for updates, if any. I fear some slowdown if the status file becomes very large, but on the other hand it is quite enjoyable to read the status of a calculation together with its output:

[screenshot: status_2]

[screenshot: status]
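One way the live update could avoid rereading a growing file is to remember the last read offset and only fetch what was appended since. The Python sketch below illustrates that idea; it is an assumption about a possible approach, not a description of what GDIS does, and the class name `StatusTail` and its restart-on-shrink rule are hypothetical.

```python
import os


class StatusTail:
    """Return only the text appended to a file since the previous call."""

    def __init__(self, path):
        self.path = path
        self.offset = 0  # byte position up to which the file has been read

    def read_new(self):
        try:
            size = os.path.getsize(self.path)
        except OSError:
            return ""  # file does not exist yet
        if size < self.offset:
            # The file shrank, so it was truncated or rewritten: start over.
            self.offset = 0
        with open(self.path, "rb") as handle:
            handle.seek(self.offset)
            data = handle.read()
            self.offset = handle.tell()
        return data.decode("utf-8", errors="replace")
```

With this scheme the cost of each refresh depends on the amount of new output and not on the total file size, which might ease the slowdown concern for very large status files.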

I'm still looking for bugs, and I ran into some strange behaviour on the CentOS distribution: the output of the ps command, used to show the CPU and memory usage of a task, is unusable. However, I don't think we have to fix that on the GDIS side.

ovhpa commented 5 years ago

I think it is now quite stable:

[screenshot: multi_6]

I had to (temporarily) remove the incremental update of the BEST graph, though. It was not very stable, so I replaced it with a function that wipes the graph data and reconstructs the BEST graph for each generation. It's a little slower but much more stable.

I think this version is ready enough to merge; please tell me if you encounter any problems afterwards.