c-scale-community / use-case-waterwatch

C-SCALE computing resources: take them or lose them! #10

Open sebastian-luna-valero opened 1 year ago

sebastian-luna-valero commented 1 year ago

Hi!

This use case currently has the following resources allocated:

Do you think that's enough for the next 9 months? C-SCALE currently has spare capacity that can be allocated now, so if you need to scale up, please let us know asap!

If we don't hear from you by Friday 30th Sept we will assume that you don't need more capacity and we will then reuse the spare capacity in C-SCALE for other use cases.

On the other hand, if you no longer need C-SCALE computing resources, please let us know as well as they will be reused properly.

Best regards, C-SCALE

sebastian-luna-valero commented 1 year ago

cc: @backeb @Jaapel

backeb commented 1 year ago

Hi @sebastian-luna-valero

We are almost ready for trials on a larger scale.

For now, the notebook only processes a single (large) reservoir; (load) testing on multiple reservoirs has not yet been performed.

The next steps are finishing the notebook and testing over multiple reservoirs within a bounding box. For scaling, I imagine we could define a very large bounding box and a longer period.

At the moment we are relying on the Terrascope openEO back-end at VITO, and we want to use compute at CESNET and access CESNET data holdings for this use case.

We would want to run the methodology in the notebook for larger areas, but I'm not sure how many resources that would consume.

Sorry for not being more helpful. We have our monthly progress meeting on 18 Oct. Can we postpone decisions until then?

enolfc commented 1 year ago

@backeb Do you expect openEO to be available also at CESNET? If so we should get started with the deployment asap.

backeb commented 1 year ago

If CESNET wants the Water Watch use case to consume compute, I think it needs to be deployed there. But I'm not sure.

@Jaapel @jdries please advise.

cc @sustr4

jdries commented 1 year ago

@Jaapel could you point me to one or more openEO processing jobs for a reservoir? If I then get the total number of reservoirs, I can do some extrapolation. This could also be useful for the processing at CESNET. (Which I guess would indeed involve setting up an openEO instance there; otherwise we would have to read the CESNET collection from our own instance, which would not consume cpu at CESNET.)

Jaapel commented 1 year ago

@jdries Job on the vito openeo 1.1 backend: j-66a66c63af73498ab8edf99505fd13c7 for a medium-sized reservoir for 5 relevant images. There are 413 reservoirs in the Czech Republic.

jdries commented 1 year ago

Thanks, we're looking at the job metadata, but already found:

sentinelhub: 723.9999827742577 sentinelhub_processing_unit
40518694 MB-seconds
16073 vcore-seconds

So that's about 4.5 cpu hours (16073 vcore-seconds / 3600), which is really rather small. (In fact, a lot of this is overhead because the task is so small.)

For the 413 reservoirs, that comes to roughly 413 x 4.5 = ~1860 cpu hours, which we can round off to about 2000 cpu hours. (2000 hours / 8 cores per VM) / 24 hours = ~10 VMs for 1 day.

Or 2000 hours / 14 vcpus / 24 hours per day = ~6 days required to finish processing.

All a bit back of envelope, and please double check, but this suggests that having 14 vcpus for the next 9 months should do it ;-)
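
For anyone who wants to double check, here is a minimal Python sketch of the estimate above. The variable names are made up for illustration; the numbers are the ones quoted in this thread. It assumes every reservoir costs about the same as the medium-sized sample job, which may well not hold for larger reservoirs or longer periods.

```python
# Figures quoted in this thread; names are illustrative only.
VCORE_SECONDS_SAMPLE_JOB = 16073   # one medium-sized reservoir, 5 images
N_RESERVOIRS = 413                 # reservoirs in the Czech Republic
CORES_PER_VM = 8
ALLOCATED_VCPUS = 14

# Assumption: every reservoir costs as much as the sample job.
cpu_hours_per_reservoir = VCORE_SECONDS_SAMPLE_JOB / 3600
total_cpu_hours = cpu_hours_per_reservoir * N_RESERVOIRS

# Round up generously, as done in the comment above.
budget_cpu_hours = 2000

vms_for_one_day = budget_cpu_hours / CORES_PER_VM / 24
days_on_allocation = budget_cpu_hours / ALLOCATED_VCPUS / 24

print(f"cpu hours per reservoir: {cpu_hours_per_reservoir:.2f}")
print(f"total cpu hours:         {total_cpu_hours:.0f}")
print(f"8-core VMs for one day:  {vms_for_one_day:.1f}")
print(f"days on 14 vcpus:        {days_on_allocation:.1f}")
```

The unrounded total lands a bit under 2000 cpu hours, so the rounded figure leaves some headroom for per-job overhead.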

enolfc commented 1 year ago

The openEO deployment runs on top of Kubernetes, so there is some additional overhead to consider (e.g. at least 1 VM for the master plus some other services running on top). Still, 2K cpu hours over 9 months should not be a problem at all with 14 cores.

So no need to increase resources for the time being.
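
To put a rough number on that, here is a small Python sketch. The function name, the 30-day month, and the 4 vcpus set aside for Kubernetes are all assumptions made for illustration, not measured values from the actual deployment.

```python
def utilisation(cpu_hours_needed, vcpus, months, reserved_vcpus=0, days_per_month=30):
    """Fraction of the available core-hours that the workload would use.

    reserved_vcpus models cores taken by the Kubernetes master and other
    services (a hypothetical figure, not taken from the real deployment).
    """
    usable_vcpus = vcpus - reserved_vcpus
    if usable_vcpus <= 0:
        raise ValueError("no vcpus left for processing")
    available_core_hours = usable_vcpus * months * days_per_month * 24
    return cpu_hours_needed / available_core_hours


# 2000 cpu hours on 14 vcpus over 9 months, without and with overhead.
print(f"no overhead:      {utilisation(2000, 14, 9):.1%}")
print(f"4 vcpus reserved: {utilisation(2000, 14, 9, reserved_vcpus=4):.1%}")
```

Under these assumptions the workload uses only a few percent of the allocation, even with part of it reserved for the cluster itself.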

Any idea on performing calculations that could increase the computing needs?

sebastian-luna-valero commented 1 year ago

As discussed in today's monthly meeting, progress is blocked until https://github.com/c-scale-community/use-case-aquamonitor/issues/26 is solved. Once it is, we can start using the openEO backend at INCD for Aquamonitor and transfer the lessons learned to CESNET for WaterWatch.