JuliaAI / MLFlowClient.jl

Julia client for MLFlow.
https://juliaai.github.io/MLFlowClient.jl/
MIT License

Proposal to buffer service requests #41

Closed: ablaom closed this issue 1 month ago

ablaom commented 4 months ago

The context of this proposal is this synchronisation issue.

The main problem with logging in parallelized operations is simply this: requests are posted directly to an MLflow service without full information about the state of the service at the time the request is ultimately acted on. I propose we resolve this as follows:

I imagine that we can insert the queue (buffer) without breaking the user-facing interface of MLFlowClient.jl.

I have implemented a POC for this proposal and shared it with two maintainers, and can share with anyone else interested.
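For readers unfamiliar with the pattern, the general idea of a request buffer can be sketched as follows. This is not the POC mentioned above and not part of MLFlowClient.jl: it is a minimal illustration in Python, and `BufferedClient`, `submit`, `close`, and the `send` callback are hypothetical names that stand in for whatever actually posts a request to the service. Callers on any thread enqueue requests; a single worker drains the queue, so the service only ever sees one request at a time, in arrival order.

```python
import queue
import threading


class BufferedClient:
    """Hypothetical wrapper: callers enqueue requests, one worker posts them in order."""

    def __init__(self, send):
        self._send = send                # stand-in for the real call to the service
        self._queue = queue.Queue()      # thread-safe FIFO buffer
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def submit(self, request):
        """Safe to call from any thread; returns without waiting for the service."""
        self._queue.put(request)

    def _drain(self):
        # Single consumer: requests reach the service one at a time.
        while True:
            request = self._queue.get()
            try:
                if request is None:      # sentinel: stop the worker
                    return
                self._send(request)
            finally:
                self._queue.task_done()

    def close(self):
        """Process everything already queued, then stop the worker."""
        self._queue.put(None)
        self._worker.join()


def demo():
    received = []                        # plays the role of the service
    client = BufferedClient(received.append)

    def log_metrics(worker_id):
        for step in range(3):
            client.submit((worker_id, step))

    threads = [threading.Thread(target=log_metrics, args=(i,)) for i in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    client.close()
    return received
```

Because only the worker touches the service, the public `submit` call can keep whatever signature the existing logging functions have, which is the sense in which a buffer need not break the user-facing interface. In Julia the same shape would typically use a `Channel` with one consumer task.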

pebeto commented 1 month ago

This requirement is a very specific task. Not everyone is using multithreading/multiprocessing to perform these kinds of operations. MLFlowClient.jl mirrors the capabilities of the original package. So, from my point of view, we must not implement a buffering solution here. This is something the user will take care of. In the MLJ.jl context, our library MLJFlow.jl contains two POC workarounds using Locks and Channels. They can be seen in JuliaAI/MLJFlow.jl#36.
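As an illustration of what a user-side lock workaround looks like in general, here is a minimal Python sketch. It is not the code from JuliaAI/MLJFlow.jl#36; `FakeService`, `log_serialized`, and `demo_lock` are hypothetical names, and the fake service only imitates a remote service whose state is read and then updated in separate steps. The caller, not the client library, wraps each logging call in a shared lock.

```python
import threading

_log_lock = threading.Lock()             # shared by every thread that logs


class FakeService:
    """Stand-in for a remote service whose state is updated non-atomically."""

    def __init__(self):
        self.next_step = 0
        self.log = []

    def post(self, value):
        step = self.next_step            # read current state
        self.log.append((step, value))
        self.next_step = step + 1        # write new state
        return step


def log_serialized(service, value):
    """User-level wrapper: only one thread talks to the service at a time."""
    with _log_lock:
        return service.post(value)


def demo_lock(n_threads=8, n_calls=50):
    service = FakeService()

    def worker(worker_id):
        for _ in range(n_calls):
            log_serialized(service, worker_id)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return service
```

The trade-off against a buffer is that every caller blocks while another request is in flight, but nothing in the client library has to change. The Julia counterpart of the `with` block would be `lock(l) do ... end` on a `ReentrantLock`.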

ablaom commented 1 month ago

I'm not 100% convinced. It seems to me any other Julia software that wants to do mlflow logging will run into exactly the same issue if they have parallelism. However, for now I'm happy to shelve the proposal in favour of the specific solutions you have worked out, thank you!