Regarding installing Delft3D FM natively on GRNET's HPC, we have decided not to do so and to use only the Delft3D FM Singularity container. The reasons for this are:
We will work on creating a recipe for installing the CMEMS MOTU API & CDS API clients and setting up a cron job to automatically download the necessary data.
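A minimal sketch of what that recipe could look like, assuming both clients are installed from PyPI; the service/product IDs, bounding box, output path, and credentials below are placeholders, not the actual HiSea configuration:

```bash
# Install the CMEMS MOTU client and the CDS API client (pip packages).
pip install motuclient cdsapi

# Example CMEMS download via the MOTU client; all values are illustrative.
python -m motuclient \
  --motu https://nrt.cmems-du.eu/motu-web/Motu \
  --service-id GLOBAL_ANALYSIS_FORECAST_PHY_001_024-TDS \
  --product-id global-analysis-forecast-phy-001-024 \
  --longitude-min -10 --longitude-max 5 \
  --latitude-min 35 --latitude-max 45 \
  --date-min "2021-10-04" --date-max "2021-10-08" \
  --variable thetao \
  --out-dir /data/cmems --out-name forecast.nc \
  --user "$CMEMS_USER" --pwd "$CMEMS_PWD"
```

(The CDS API client reads its credentials from `~/.cdsapirc` rather than from command-line flags.)

Activity for the providers: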
We will work on generalising the pre- and post-processing containers. To do so we will need workflow tooling, namely:
cc @kkoumantaros @cchatzikyriakou @sebastian-luna-valero @enolfc @sustr4
1b. i) @avgils register at https://sram.surf.nl/ to the C-SCALE-test-co and add your public SSH key there (see the key-generation sketch below); ii) then send an e-mail to support@hpc.grnet.gr with your IP pool, which will be used to access ARIS
2b. The VM operators/users have root access to perform all these actions. Is there a particular activity that only providers are able to implement?
PS: Data resources (CPU and Storage) per project / use case / user should be defined in the SRAM-LDAP soon.
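For reference, generating the key pair to upload is straightforward (a standard sketch; the key type, comment, and file name are just suggestions):

```bash
# Generate an SSH key pair locally; ed25519 is a reasonable default.
ssh-keygen -t ed25519 -C "your-name@your-org" -f ~/.ssh/id_ed25519_sram

# Print the public key and paste it into your SRAM profile.
cat ~/.ssh/id_ed25519_sram.pub
```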
cc @yan0s @ntellgrnet @kkoumantaros
- Sure, running it in Singularity is no problem for us. For single-node (one machine) runs, performance will probably be similar. However, keep in mind that this does not exploit the full HPC capability, since Delft3D FM can run across many nodes (MPI standard). If you run the application across many nodes through Singularity, performance will be poor, since the container has no knowledge of the network layer and interconnect technology (Ethernet, InfiniBand, etc.).
Thanks. In discussions yesterday with our team working on Singularity, I was told that there can be performance differences between using the MPI library inside the Singularity container and telling the container to use the MPI library installed on the HPC. The latter requires a configuration step pointing the container at the host's MPI installation.
Perhaps we can test those two scenarios and evaluate their impact on performance.
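For the record, the two invocations we would be comparing look roughly like this (a sketch only; the image name, executable, module name, rank count, and bind paths are assumptions about the ARIS setup):

```bash
# Scenario A ("hybrid" model): use the MPI library inside the container.
# The host mpirun launches one container instance per rank; the container
# and host MPI versions must be compatible.
mpirun -np 64 singularity exec delft3dfm.sif dflowfm model.mdu

# Scenario B ("bind" model): use the host MPI library. Bind the host MPI
# installation into the container and point the loader at it, so the run
# goes through the interconnect-aware host library.
module load openmpi
export SINGULARITYENV_LD_LIBRARY_PATH=/apps/openmpi/lib
mpirun -np 64 -x SINGULARITYENV_LD_LIBRARY_PATH \
  singularity exec --bind /apps/openmpi delft3dfm.sif dflowfm model.mdu
```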
> 2b. The VM operators/users have root access to perform all these actions. Is there a particular activity that only providers are able to implement?
So, if the objective here is to have a TOSCA template / image of the CMEMS MOTU client that downloads to an NFS server accessible by multiple VMs or by the login node of an HPC, we can install the CMEMS MOTU client, set up the cron job for automatic downloads, and provide that as an example; you will then have to figure out how you want to offer that as a service to other users.
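As a concrete starting point, the cron part could be as small as this (a sketch; the script path and NFS mount point are placeholders):

```bash
# Crontab entry: run the download script daily at 03:00 and log to the
# NFS share that the VMs / HPC login node mount.
0 3 * * * /opt/cmems/download_cmems.sh >> /mnt/nfs/cmems/download.log 2>&1
```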
> PS: Data resources (CPU and Storage) per project / use case / user should be defined in the SRAM-LDAP soon.
I expect the providers will inform us of the workflow once this is ready?
One more comment on the Cloud - HPC interconnection: GRNET's storage facilities and policies keep the storage of the two infrastructures (Cloud & HPC) separate. For this reason, the data produced by the pre-processing that needs to be passed to Delft3D FM, and vice versa, must be transferred via the SSH protocol, so SSH-based tools should be used (scp, rsync over SSH, etc.).
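For example, pushing pre-processing output from a cloud VM to the HPC, and pulling Delft3D FM results back, could look like this (host name and paths are placeholders):

```bash
# Push pre-processing output to the HPC over SSH; -a preserves attributes,
# -z compresses in transit, --partial lets interrupted transfers resume.
rsync -az --partial /mnt/nfs/preproc/ user@login.aris.grnet.gr:/work/hisea/input/

# Pull Delft3D FM results back to the cloud side.
rsync -az --partial user@login.aris.grnet.gr:/work/hisea/output/ /mnt/nfs/results/
```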
https://confluence.egi.eu/pages/viewpage.action?pageId=103161695
Hi,
I don't have permissions to edit the post, but here is the link to the "how to get access" for the GRNET HPC so far: https://confluence.egi.eu/display/CSCALE/Use+case%3A+HiSea#Usecase:HiSea-Howtogetaccess
I have also resent the invites to @avgils and @sandragaytan to join the HiSea CO in SRAM.
@sebastian-luna-valero: @avgils and I followed the steps, uploaded the public SSH key to our SRAM profiles, and emailed the IP pool (Deltares range) to support@hpc.grnet.gr. Awaiting reply and instructions.
Thanks @lorincmeszaros
Next step is to wait for support@hpc.grnet.gr to confirm access and follow their instructions.
Happy to help over here if you find issues.
The architecture options below could be addressed in individual sprints:
Sprint 1: test option 1 and develop option 2
Dates: 4 - 8 Oct 2021
Data requirements and specs:
Put info for the above 3 tasks here: https://confluence.egi.eu/x/Xx8mBg
Automate data downloads on provider side
- [ ] Instantiate another VM and work on automation (@backeb @sandragaytan)

Documentation
Performance testing
Document results here: https://confluence.egi.eu/x/Xx8mBg