infinyon / fluvio-client-python

The Fluvio Python Client!
https://infinyon.github.io/fluvio-client-python/fluvio.html
Apache License 2.0
13 stars 12 forks source link

feat: add python realization of ProduceOutput and RecordMetadata #393

Closed dariogoetz closed 6 months ago

dariogoetz commented 6 months ago

This is a WIP pull request addressing issue #389.

It adds two python representations ProduceOutput and RecordMetadata allowing to retrieve metadata for records sent by the TopicProducer. The implementation tries to mimic the existing rust interface. However, the PyO3-wrapper ProduceOutput places the inner (native) ProduceOutputinto a Cell because its only method mainconsumes self, which can not be realized with a PyO3-wrapper. This means that a subsequent call to the wait method will find a Defaultversion of the object and result in a ProduceError::GetRecordMetadata(None) (in pure Rust, such a subsequent call is impossible as the underlying ProduceOutput would not exist anymore after the first call).

I would like to hear your suggestions regarding this realization using Cell and if you have some good idea to avoid such a construction.

digikata commented 6 months ago

Added a comment in parent issue about maybe using PyO3 IntoPy trait for this?

dariogoetz commented 6 months ago

I am not sure that it would help here (but I may have a wrong understanding of its working.

Using the IntoPy trait, would let the ProduceOutput appear like a RecordMetadata to the python world, if I understand it correctly.

But this is not really the intended effect, is it? It should be up to the user to decide whether they want to wait for the RecordMetadata or not (in particular, if it is a blocking call). So the call to wait should be explicit. And in order to be able to execute such a call, it needs some object to be bound to (the ProduceOutput). This in turn means that a ProduceOutput object needs to be available in python land.

But my understanding may be wrong.

dariogoetz commented 6 months ago

Another option instead of using Cell<ProduceOutput> for the inner part and produce errors on subsequent calls would be to use an Option and return None for subsequent calls to wait.

digikata commented 6 months ago

Using an option for the inner wrap would be a way to avoid the Cell and I think read a little clearer in the intent.

dariogoetz commented 6 months ago

I updated the async send variants and added an asynchronous async_wait. I also added tests for waiting on RecordMetadata both for sync and async variants. I also added docstrings.

If the CI/CD goes through, the PR is ready from my perspective.

I do have a failing test, though it is not mine and I am not sure if it is related: test_fluvio_python.TestFluvioAdminTopic.test_admin_paritions. Also, there is a typo in the test's name: I suppose that is shall be called test_admin_partitions.

PS: I can squash the commits, if required.

digikata commented 6 months ago

I think this looks good to me code wise. The CI all passed, so is the failing test on your system?

dariogoetz commented 6 months ago

It is a test failing locally on my installation, yes. To be honest, I don't quite understand, what that test is supposed to do. It connects to the cluster, does nothing and then expects there to be topics available (where should they come from?). I would actually assume there to be no topics, so instead of assertNotEqual I would expect a assertEqual. And I wonder, why the test does not fail in the CI/CD pipeline. I could misunderstand the test, though. I am certainly not familiar with the whole module.

dariogoetz commented 6 months ago

The commits are now squashed into one.

digikata commented 6 months ago

I've run the tests on my system and they all pass. The test could fail if you don't have a currently operable fluvio profile that connects to a running cluster. The CI sets that up separately (and I have one by default too ). The simplest way is to make sure a local fluvio cluster is started first fluvio cluster start.

Merging the PR though, and thanks for the work and attention!

dariogoetz commented 6 months ago

Thank you for the merge.

I do have a running cluster. I believe that the test fails if the original state of the fluvio cluster does not contain any topics (i.e. it is an empty cluster).

Could you check, if your test also passes when you start the tests on a cluster with no topics existing in the beginning?

digikata commented 6 months ago

Yes it seems to need a topic. and fails withoug. Seems like a temporary topic should be created as part of the test (as happens in many other tests already)

Added an issue to remember to update this #397

digikata commented 6 months ago

Keeping an eye on maybe needing to revert this temporarily. Looks like CI runs are sometimes taking a long time as well as build errors that when integrated w/ a dev build PR (https://github.com/infinyon/fluvio-client-python/pull/396) https://github.com/infinyon/fluvio-client-python/actions/runs/8361426988/job/22890298839