Open ablaom opened 10 months ago
@ablaom do you think is a good idea to replicate the environment variable approach implemented in MLJFlow testing suite? I think it can ensure the user to run all the tests correctly.
Sure, unless you have a better suggestion.
When the initial development happened, the assumption was that the developer would have a local mlflow instance running prior to executing the tests. Moreover, it was also up to restart (wipe out) the mlflow instance prior to running the tests another time. This helped with debugging (i.e. final state remains in mlflow if tests fail and developer can debug) and it was also meaningful because actual functionality was tested against an actual instance. The same setup we have in the github CI job - the mlflow instance is actually a separate container that is spun up alongside the test container.
Another way to go would be to mock the mlflow endpoints. This will not require an mlflow instance at all. That's probably the correct way in general and it will allow Pkg.test()
to execute without any service dependencies.
Helpful comments, @deyandyankov, thank you.
Another way to go would be to mock the mlflow endpoints. This will not require an mlflow instance at all. That's probably the correct way in general and it will allow Pkg.test() to execute without any service dependencies.
Do you mean a fully functioning mock up of an mlflow service? This would have these drawbacks:
As maintenance on this site has become exceedingly thin, this doesn't make this realistic option just now, in my view. (Side note: I did actually created a baby mock-up for the POC at #41.)
@deyandyankov I might have missed something, but is there anything proposed in #38 that upsets the testing workflow you just described?
To improve the project testability, I suggest to use BrokenRecord.jl. It brings us a "cassette" approach to record the HTTP requests, removing the necessity of having a mlflow
instance running on our local machine.
However, this can include a new environment variable USE_TEST_RECORDS
, because testing using a true server is still being required for the CI job (and locally if required).
This sounds like a nice idea!
Testing this package in the usual way, on one's local machine, does not work:
Obviously the error thrown in not helpful. I expect one needs an active MLflow service running on your machine, and that it must have the appropriate uri.
Here is a related issue with a suggestion for remedy: https://github.com/JuliaAI/MLJFlow.jl/issues/20