The problem is that the "new gimic" master branch is not working in parallel yet. Otherwise I would expect the same efficiency. One also has to keep in mind that the new gimic still contains the old cube-file visualization routines, so some information gets written out twice.
So in other words once we restore the parallelization there should be no reason to run the old GIMIC, right?
In theory, yes. In practice, we need to test. Keeping the "old gimic" around is good for testing and double-checking until we are 100% sure about the efficiency. Ideally the "new gimic" should be much faster than the old one.
The stable branch will never ever be removed, but I want to know why the new GIMIC is not used. It feels wrong to be developing something that is not used, so I want to know precisely why it is not used. This is not a question about keeping the old code as a reference just in case - we both agree on that.
I would not say it is not used at all. I need it for the visualizations I am doing, but the calculations take forever. That is why I was happy to write the AUS application to improve the present code. It would be good to have the code operational in parallel. Perhaps @mariavd could also comment on this. I saw that she looked at huge systems with GIMIC, of up to 300 atoms. That cannot have been done using the serial code.
So no need to be afraid of developing a code that is not used.
To condense my question for Maria: is the difference in efficiency only the lack of MPI? Is it equally fast/slow on one core?
I have never used the code from the stable branch, and I have never used MPI with GIMIC. The current profiles are feasible thanks to poor man's parallelization: running 1000 slices 24 at a time is quite reasonable, even for the huge systems Kevin and Lukas are working on, with up to 1000 atoms.
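For concreteness, a minimal sketch of what such a poor man's parallelization could look like. The per-slice directory layout (`slice_0000`, `slice_0001`, ...) and the `gimic gimic.inp` invocation are assumptions for illustration, not the actual workflow used here:

```python
# Sketch: run many independent serial slice calculations, at most 24 at a time.
# Assumptions (not taken from the GIMIC docs): each slice lives in its own
# directory slice_0000, slice_0001, ... containing a ready-made input file,
# and the serial executable is invoked as `gimic gimic.inp`.
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

MAX_WORKERS = 24  # e.g. the number of cores on one node

def run_slice(slice_dir: Path) -> int:
    """Run one serial GIMIC calculation inside its slice directory."""
    with open(slice_dir / "gimic.out", "w") as log:
        result = subprocess.run(
            ["gimic", "gimic.inp"],      # hypothetical invocation
            cwd=slice_dir, stdout=log, stderr=subprocess.STDOUT,
        )
    return result.returncode

if __name__ == "__main__":
    slice_dirs = sorted(Path(".").glob("slice_*"))
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        codes = list(pool.map(run_slice, slice_dirs))
    failed = [d.name for d, c in zip(slice_dirs, codes) if c != 0]
    print(f"{len(slice_dirs) - len(failed)} slices finished, {len(failed)} failed: {failed}")
```

A thread pool is enough here because the actual work happens in the external processes; the Python side only waits for them.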
As to the 3D current density calculations for ParaView, for systems of several hundred atoms they are unfortunately only possible on my computer at home, since it has better single-thread performance than the CPUs on the cluster. I have done them for systems of up to 450 atoms, and they finished within 24 hours, which is fine for me. I cannot remember the number of grid points, but it was not a coarse grid.
Thank you. So who is still using the stable branch? You, Heike?
If the question is whether it is reasonable to put effort into parallelization, then YES, definitely. 3D visualisations are important for any system, and people would benefit from running them on a cluster, too.
Yes, parallelization: definitely. I am trying to figure out whether there is more than just missing parallelization.
Yes, I am using the stable branch. After reading Maria's comments I got suspicious: how can it be that things run faster on a laptop than on a cluster?
The Xeons on Taito simply can't match the single-thread performance of a modern laptop (mine runs a Skylake at a nearly constant 3 GHz thanks to good cooling) or of my desktop (4.6 GHz). I expect that the Norwegian cluster has similar CPUs to the Finnish one. I am short on time, but once I get a breath of fresh air I will do the performance profiling.
I am not surprised that the laptop is faster. A cluster does not mean fast; a cluster only means many :-) Before we do profiling, even simple timing will already help.
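In that spirit, a minimal wall-clock timing sketch, assuming the two builds are available as `./gimic-stable` and `./gimic-master`, that both read the same `gimic.inp`, and that `OMP_NUM_THREADS=1` is enough to pin either build to one core (all of these are assumptions, not verified details of the builds):

```python
# Sketch: run the same input through both builds on a single core
# and compare wall-clock times, before doing any real profiling.
import os
import subprocess
import time

INPUT = "gimic.inp"
BUILDS = {"stable": "./gimic-stable", "master": "./gimic-master"}

# Hypothetical single-core setting; adjust if the builds are threaded differently.
env = dict(os.environ, OMP_NUM_THREADS="1")

for name, exe in BUILDS.items():
    start = time.perf_counter()
    subprocess.run([exe, INPUT], env=env, check=True,
                   stdout=subprocess.DEVNULL)
    elapsed = time.perf_counter() - start
    print(f"{name:7s} {elapsed:8.1f} s")
```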
This discussion is related to #114. Should we close this issue and continue in #114?
If the difference in speed is lack of parallelization, then yes. If the code is also slower on one core, then it's a different issue.
Then we leave the issue open, since we do not know the performance yet.
I don't see a reason why the old code should be faster (apart from parallelization, but that is a different issue). If you observe a difference in single-thread performance, please reopen this issue and convince me with an example input that demonstrates it. Until then we close this.
OK.
I heard that the "old" GIMIC (stable branch) is more efficient than the recent code (master branch). Is this difference only due to the parallelization? Or is there also a difference on a single core?