NVIDIA / NVFlare

NVIDIA Federated Learning Application Runtime Environment
https://nvidia.github.io/NVFlare/
Apache License 2.0

Supply config at runtime #259

Open Nintorac opened 2 years ago

Nintorac commented 2 years ago

It would be useful to allow specifying configuration files to use at runtime.

The use case I have in mind is hyperparameter tuning. I want to be able to generate configurations dynamically in a script and then send them directly via the FLAdminAPIRunner API, e.g. something like:

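import os

# Import path as used in the NVFlare 2.0 examples.
from nvflare.fuel.hci.client.fl_admin_api_runner import FLAdminAPIRunner

# SERVER, APP_NAME and config_generator are placeholders defined elsewhere.
client = FLAdminAPIRunner(
    host=SERVER,
    port=8003,
    poc=True,
    username="admin",
    admin_dir=os.environ.get("ADMIN_DIR", "/admin"),
)

# Proposed usage: run() takes the generated config as an extra argument.
for i, config in enumerate(config_generator):
    client.run(i, APP_NAME, config, restart_all_first=False, shutdown_at_end=False, shutdown_on_error=False)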

This could also be useful for one-off experiments, as it would allow the use of config management libraries (e.g. Hydra).

The closest example I could find is the CIFAR example, where each app folder contains only the config. I couldn't follow where the app code is actually copied over, though.

I think this approach might help to draw a distinction between a run and an app, e.g. a run is an app + config.

yanchengnv commented 2 years ago

We are redesigning the whole RUN management system to make it more like "job scheduling". Basically you will be able to create jobs (app configs) either statically or dynamically, and then upload them to the system. It is up to the system to schedule and run these jobs one by one or in parallel (if resources permit). Once a job is submitted, you will get a unique Job ID (JID), and you can use this JID to query or cancel the job.

In your example, the "config_generator" will need to dynamically generate the job configs on the file system, and then upload the generated jobs to the system via the API.
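The file-system approach described above can be sketched as follows. This is only an illustration under stated assumptions, not NVFlare's own tooling: `generate_job_dirs` is a hypothetical helper, the `hyperparams` key is a made-up place to store the overrides, and the only layout assumed is an app folder containing `config/config_fed_client.json`. It copies a base app folder once per hyperparameter combination and patches the client config in each copy; uploading the resulting folders is left to the admin API.

```python
import itertools
import json
import shutil
from pathlib import Path


def generate_job_dirs(base_app: Path, out_root: Path, grid: dict) -> list:
    """Copy base_app once per hyperparameter combination and patch its client config.

    grid maps a hyperparameter name to the list of values to sweep over.
    Returns the list of generated folders, one per combination.
    """
    job_dirs = []
    keys = sorted(grid)  # fixed key order so the folder numbering is reproducible
    combos = itertools.product(*(grid[k] for k in keys))
    for i, values in enumerate(combos):
        job_dir = out_root / f"{base_app.name}_{i}"
        shutil.copytree(base_app, job_dir)  # also creates out_root if missing

        cfg_path = job_dir / "config" / "config_fed_client.json"
        cfg = json.loads(cfg_path.read_text())
        # "hyperparams" is a hypothetical key; use whatever your app code reads.
        cfg.setdefault("hyperparams", {}).update(dict(zip(keys, values)))
        cfg_path.write_text(json.dumps(cfg, indent=2))

        job_dirs.append(job_dir)
    return job_dirs
```

Calling `generate_job_dirs(Path("my_app"), Path("generated"), {"lr": [0.1, 0.01], "batch_size": [16, 32]})` would produce four folders, `generated/my_app_0` through `generated/my_app_3`, each of which can then be uploaded as its own app or job.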

Nintorac commented 2 years ago

> We are redesigning the whole RUN management system

Cool, do you have any docs on the plans that I could look at?

> In your example, the "config_generator" will need to dynamically generate the job configs on the file system, and then upload the generated jobs to the system via the API.

Yes, this is what I'm doing. It just feels a little clunky, e.g. what is the difference between an app and a run at the moment? But if I understand your first paragraph correctly and this will be deprecated, then cool :)

pdogra89 commented 2 years ago

@Nintorac Thank you for your interest in and contributions to NVIDIA FLARE. I am Prerna, and I am responsible for Product Management. I would like to get on a call with you to learn about your use case and understand how we can help. Please feel free to email me at pdogra@nvidia.com and we can take it from there.