ITISFoundation / osparc-issues

🐼 issue-only repo for the osparc project

Different S4L Python Runners have different default resources #1762

Closed JavierGOrdonnez closed 1 day ago

JavierGOrdonnez commented 1 month ago

N.B. I will circumvent this issue by using the 8.0 v1.2.0 flavor (right defaults), but it should be fixed if this one is meant to be deprecated.

Long Story Short We have a study with different Python Runners, created by Antonino (who, I believe, adjusted the resources of each of them). When converting to a template, they revert to default values (which I would guess is the expected behaviour? It could be discussed whether this is desirable to change, or too much of a niche case to be worth the effort). Thus, when cloning and running the pipeline, errors arise due to lack of resources.

Expected Behavior All flavours should get 4 CPUs and 32GB RAM by default.

Actual behaviour There are three versions of S4L Python Runner:

  1. The one displayed as Sim4Life Python Runner 8.0 1.2.0. This one gets assigned 4 CPUs and 32 GB RAM by default.
  2. The one displayed as sim4life python runner 8.0.2. This one gets assigned only 1 CPU and 2 GB RAM by default. This is a problem.
  3. The one displayed as sim4life python runner edge 8.1. This one gets assigned only 1 CPU and 2 GB RAM by default. This is a problem.

Steps to reproduce Clone the template: 2ae4b01e-8709-11ef-a023-0242ac1752b8 (Y7-MS 11.2.1.4_for_Metamodeling (WVG))

wvangeit commented 1 month ago

Same issue: https://github.com/ITISFoundation/osparc-simcore/issues/6518

JavierGOrdonnez commented 1 month ago

Thanks @wvangeit for the link. As a clarification: Werner's issue addresses keeping resource specifications over the template creation & instantiation process. This issue is specifically about the S4L Python Runners, whose defaults should be modified.

mguidon commented 1 month ago

Hi. The defaults are kept to a minimum on purpose. If you need to increase the resources, this should be done via personalized resources in the UI.

mguidon commented 1 month ago

I think the cloned study should fall back to default resources only if the sharee does not have enough permissions to change them.

JavierGOrdonnez commented 1 month ago

Probably that is what happened - Werner does not have permission to change them.

However this is problematic imho. If I create a template for a collaborator, I would like to be able to share that template with external collaborators in a functional state, and without needing to give them the right to change resources in all their services in all their studies.

JavierGOrdonnez commented 1 month ago

With regard to minimizing resources, I would agree for the generic python runner. But for the S4L python runner, I can hardly think of any task which requires S4L and will fit within such constraints -- most S4L runners are used for simulations, postprocessing, or both, which take significant resources. Even for a simple data extraction from S4L (the simplest task, compute-wise, that I can think of), it is likely that the model + simulation to be extracted will be more than 2GB.

wvangeit commented 1 month ago

I agree with @JavierGOrdonnez . Imo a template should be an 'exact' copy of the original study, and the resources should be part of that copy. Implicit downgrading of these resources is confusing. If the user can't run the template because he/she has no access to these resources, a warning should be shown to the user. There are two options imo. Either we don't downgrade the resources, and we can still warn the creator of the template that it might not be runnable by all people, because the resources are modified. Or we do force a downgrade of the resources, but then we definitely warn the creator of the template that this is happening, and that the template should probably be tested with the downgraded resources.
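
The two options could be sketched roughly as follows. This is only an illustration: `Resources`, `publish_as_template` and the `downgrade` flag are hypothetical names made up for this example, not part of the osparc code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical per-node resource spec (not an osparc type)."""
    cpus: int
    ram_gb: int


def publish_as_template(study, defaults, downgrade):
    """Return (template_resources, warnings_for_creator).

    study:     {node name: Resources set in the original study}
    defaults:  {node name: default Resources of that service}
    downgrade: False keeps custom resources, True resets them to defaults.
    Either way the template creator gets an explicit warning.
    """
    template, warnings = {}, []
    for node, custom in study.items():
        default = defaults[node]
        if custom == default:
            # Nothing was customized, so nothing to warn about.
            template[node] = custom
        elif downgrade:
            template[node] = default
            warnings.append(
                f"{node}: reset to defaults ({default.cpus} CPU, "
                f"{default.ram_gb} GB RAM); test the template with these values"
            )
        else:
            template[node] = custom
            warnings.append(
                f"{node}: keeps {custom.cpus} CPU, {custom.ram_gb} GB RAM; "
                "users without access to these resources may not be able to run it"
            )
    return template, warnings
```

In both branches the creator is told what happened, so there is no implicit downgrade.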

mguidon commented 1 month ago

What we did in the past with collaborators was:

  1. Create a dedicated group
  2. share all services from the study with them
  3. Override the resource specs to the required amount for all members of the group, or give all members of the group access to resource overriding. (I prefer the former)

Unfortunately, step 3 needs to be performed in the db directly.

JavierGOrdonnez commented 1 month ago

Thanks for the clarification. For me, this sounds messy and prone to human error. Furthermore, I would argue that the resource specifications should be attached to the template / study and not to the user, to ensure consistent results (same template / study should run no matter which user is running it).

mguidon commented 1 month ago

Well, you are saying this because you do not have to pay for the usage 😂

Let's discuss this in person.

wvangeit commented 1 month ago

This seems to be rather a waste of resources to me, because a user might start running a study that fails at some point because a service does not have the required resources.

The cleanest would be that the resources are attached to the study (i.e. nodes inside a study), and when running the study, it stops the user from the beginning when he/she doesn't have the required resources. That way the user can decide to increase these, and as such knows about the consequences wrt credits etc.
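
A minimal sketch of such an up-front check, assuming resources are stored per node and each user has a single limit. All names here (`Resources`, `find_blocked_nodes`) are hypothetical and only illustrate the idea; they are not osparc APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical resource spec attached to a node or a user limit."""
    cpus: int
    ram_gb: int


def find_blocked_nodes(node_requirements, user_limit):
    """Return the nodes whose requirements exceed what the user may use.

    node_requirements: {node name: Resources attached to that node}
    user_limit:        the maximum Resources this user is allowed per node
    An empty result means the study can start; otherwise the run should be
    refused before any service is launched.
    """
    return {
        node: required
        for node, required in node_requirements.items()
        if required.cpus > user_limit.cpus or required.ram_gb > user_limit.ram_gb
    }
```

The caller would run this before starting the pipeline and, if the result is not empty, show the user which nodes need more resources instead of letting the study fail halfway through.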

JavierGOrdonnez commented 1 month ago

Fyi publishing my version of the study (in which all services have 4CPU and 32GB RAM) as a template and then opening it again, results in reset resource values. I am the Owner of both the initial study and the template, and I haven't shared it with anyone else.

JavierGOrdonnez commented 1 month ago

This is quite critical - it is not possible to run metamodeling with these defaults. @mguidon I propose the defaults are increased for the Metamodeling user group, asap, so that we can move on. Then we discuss & implement a more general approach. Would that be possible?

mguidon commented 1 month ago

I see. That is clearly not how I was assuming this works. So let's go for option 3a then. Let's move to PM.

sanderegg commented 1 month ago

@wvangeit @JavierGOrdonnez @mguidon Please check:

which are bugs with the template creation regarding resources.

Nevertheless, I would also like to highlight that the current process of creating templates for metamodeling is absolutely wrong. We should fix it and allow for batch creation instead, because now we start chasing dragons:

mguidon commented 1 month ago

Thanks @sanderegg, I was just about to start reproducing this.