Closed MarekWlasak closed 4 years ago
That's fine with me if you want to have a go at this.
Oh, and you have one hour! :-)
Really, I don't have a good idea yet of how long that could take.
HI Marek, You're welcome to give that a go.
There's no concept of structured data in the HaloExchange
class though. The basic procedure for current halo-exchange is:
setup(...)
(called only once) --> create a sendmap and a recvmap, which hold indices that need to be added to respectively the mpi-send and mpi-recv buffersIt may be a little confusing because the packing and unpacking of the send- and recv- buffers has 2 implementations depending on availability of CUDA. You can ignore the CUDA implementation, and we can worry about that later.
Thanks @wdeconinck @ytremolet. Realistically, I will be able to start on Monday as there a few thing other things I need to tie up first.
Hi. I think it would be easier for me to work initially from the JCSDA forked version of atlas. @wdeconinck - is that o.k?
Yes. It is not allowed to create branches in the ecmwf/atlas repo, only create branches in a fork and submit a PR. This fork could be any fork, such as the jcsda fork.
Thanks!
@ytremolet @wdeconinck - Just an update ... I have got it to work (without CUDA) and am halfway through building the tests ... I should have a PR ready sometime tomorrow. What are the governance rules for PRs? I presume that someone from ECMWF will need to be requested to review for each PR. Or is there going to be a two step process, with an internal JCSDA review followed by a ECMWF review (when JCSDA atlas repo gets updated / merged / realigned with the main ECMWF repo?
Any PR should be submitted to the ecmwf repo, and branching off the atlas develop branch. I will then assign reviewers. Currently that would be myself. Make sure to have a unit-test that tests your functionality. I will do an internal test to see if the code works across all platforms we use, including Cray XC40.
I have a bit a problem ... I worked off the develop-jcsda branch of the JCSDA atlas fork, because it was significantly more up to date than the forked atlas develop. Comparing my change with the atlas develop shows up Benjamin's changes to PointCloud class. I don't think the JCSDA's forked version of atlas develop can be updated to the latest version without updates to other repos, @benjaminmenetrier am I right? Do I need to wait until JCSDA switches to using develop-jcsda? I am not sure whether I can run the ecmwf's develop on JCSDA's estate.
Also I don't know where to get the appropriate version of gridtools so that I can check whether I have coded that part correctly.
@MarekWlasak At this point you have two options:
It's up to you which route you prefer (unless Willem has other comments), but in any case, the PR should be on the ecmwf repo.
@wdeconinck This is an example where I could help with the review if we could add me as a reviewer. This is really for DA applications so I like to keep on top of what is available to us and I have been writing adjoints for many years which can be useful (although Marek has that experience as well).
Hi @MarekWlasak If you create the PR, I can try myself to get the gridtools-backend working, as long as you have a unit-test I can use. When you do create the PR, you can click a checkbox that allows me to edit your branch directly. As @ytremolet said, the develop-jcsda branch is of temporary nature, and is currently broken. It should not be considered.
Hi @wdeconinck. Thanks for the offer on running the gridtools-backend. I have almost finished the unit-tests. There will be twice the number that you have for the normal halo exchange and they are taking longer to write than the original code.
I started my code branching from JCSDA's develop-jcsda branch. Once I finish these tests, I plan to copy the code into a new branch that is based around the latest ecmwf's atlas develop. I am not sure whether I will be able to run my tests with it though. Do I need the latest versions of eckit and fckit?
I have completed the adjoint tests, barring the ones involving gridtools. I could create a temporary PR with the develop-jcsda branch so that you can see what I have been up to. Would that be of use while I look and see while I can get a branch based on ecmwf atlas develop version to run?
Wow! That was not as hard as I expected it to be. #16 is the associated PR.
Hi. I think that this is something that I could tackle, initially for structured data. Is that o.k @wdeconinck @ytremolet ? Please tell me either way. I won't be offended if you say no. If the answer is yes, then tell me how long I have got.