JDASoftwareGroup / kartothek

A consistent table management library in python
https://kartothek.readthedocs.io/en/stable
MIT License
161 stars 53 forks source link

Move to minimalkv #462

Open xhochy opened 3 years ago

xhochy commented 3 years ago

Description:

Since simplekv is unmaintained and we cannot add new maintainers, this PR moves to its successor project minimalkv

fjetter commented 3 years ago

For the sake of compatibility, I'm wondering if simplekv stores are still compatible or not. and how/if we can ensure this and manage the deprecation properly

xhochy commented 3 years ago

For the sake of compatibility, I'm wondering if simplekv stores are still compatible or not. and how/if we can ensure this and manage the deprecation properly

We didn't change the API in any aspect, the move was solely a rename of the top-level namespace. Passing in an existing simplekv store or a storefact-URL will work without issues. Having simplekv / storefact installed alongside minimalkv also works.

xhochy commented 3 years ago

Verified locally that https://github.com/data-engineering-collective/minimalkv/pull/24 fixes the documentation build issues. I'll retrigger the PR here once that has been released.

xhochy commented 3 years ago

All tests passed except the one that is also failing on master. This is probably an issue with the latest pandas nightly and unrelated to the simplekv -> minimalkv switch.

xhochy commented 3 years ago

Figured out the issue we see in the tests confirming that it is totally unrelated to this PR:https://github.com/dask/dask/issues/7610

OK to merge?

stephan-hesselmann-by commented 3 years ago

Thank you @xhochy for the contribution and for looking into the broken nightly test. We are discussing a possible switch to minimalkv internally, will get back to you.

xhochy commented 3 years ago

@stephan-hesselmann-by Any update here or any information you would need? I did contact BY before starting minimalkv but as I'm no longer part of it, I'm not sure I reached out to all relevant parties. Switching should work quite easily as this is designed with care so that you can mix & match simplekv, minimalkv and storefact in one environment easily until the whole stack is migrated to minimalkv.

fjetter commented 3 years ago

If it helps with migration efforts we could think about implementing a proxy in kartothek (yet anotherget_store) which would point to the "current" backend store lib. FWIW, I don't think this is really necessary as long as minimalkv doesn't change the class interfaces without announcement

stephan-hesselmann-by commented 3 years ago

Unfortunately I don't have a definitive answer yet. The issue for us is not only Kartothek but also other Blue Yonder artifacts that currently use simplekv. That is also the reason for the indecisiveness.

@xhochy if we do decide to migrate to minimalkv, would you be willing to grant Blue Yonder "co-ownerhsip"? We would want to avoid another simplekv situation in the future and also make sure that the interface stays compatible with simplekv as long as it is used by Blue Yonder.

xhochy commented 3 years ago

Unfortunately I don't have a definitive answer yet. The issue for us is not only Kartothek but also other Blue Yonder artifacts that currently use simplekv. That is also the reason for the indecisiveness.

They can use simplekv as a direct import and as a dependency, no need to change anything there. We already use minimalkv together with Kartothek and have simplekv and storefact installed.

@xhochy if we do decide to migrate to minimalkv, would you be willing to grant Blue Yonder "co-ownerhsip"? We would want to avoid another simplekv situation in the future and also make sure that the interface stays compatible with simplekv as long as it is used by Blue Yonder.

Happy to give access to people who contribute (see http://theapacheway.com/merit/), I don't want to hand over control to Blue Yonder just because they use it*. What would be your benefit / intention here exactly?

*You neither have full access to pandas either?, or more concrete: I would also like to have access back to kartothek because I use and advocate for it (also created it) but at the time of writing, I don't have that, putting me in the same situation. Thus for me this kind of argument is quite weird.

xhochy commented 3 years ago

Note I gave @fjetter access to minimalkv https://github.com/data-engineering-collective/minimalkv/pull/38 as he helped in the past and current time with it a bit. It may not be so relevant for you but that adds a QuantCo-independent maintainer to it.

fjetter commented 3 years ago

simplekv is de-factor not maintained and ownership is questionable. As we've seen with simplekv, both ownership and question does also not necessarily connect with admin or write access. minimalkv is maintained and ownership is clearly defined. This PR removes therefore a high risk component which caused issues in the past with a better alternative.

The issue for us is not only Kartothek but also other Blue Yonder artifacts that currently use simplekv. That is also the reason for the indecisiveness.

Whether or not BY decides to "migrate" from simpelkv to minimalkv is irrelevant for this PR (although I can highly recommend). If you decide to use both, you are free to do so. The question for this PR would be more appropriately whether this change is breaking for the library kartothek. As it stands right now, you can use both libraries but by default we'll install minimalkv with this PR.

fjetter commented 3 years ago

https://github.com/JDASoftwareGroup/kartothek/pull/376 implemented duck typing for the storage interface. as long as this stays, the choice for simplekv or minimalkv should not impact functionality

stephan-hesselmann-by commented 3 years ago

I agree with your points and also share some your frustrations with this situation. I will try to argue in favor of this PR but as you know I cannot guarantee success.

mlondschien commented 3 years ago

Are there any updates on this?

felix-marczinowski-by commented 3 years ago

The original Maintainer of simplekv has agreed to add additional maintainers/hand over maintainership. @xhochy is one of the designated new maintainers, which should resolve the situation that lead to the fork.

I hope that once the formal changes are through, we can go back to just simplekv, this time with a bit more lively development!

xhochy commented 3 years ago

I don't think that adding maintainers to the existing mbr/simplekv repository fixes the root of the issue. We will still not be able to fully manage that without the help of mbr.

Also, contrary to any mails, I have not yet received any rights over there.

felix-marczinowski-by commented 3 years ago

As I understood, mbr would extend/transfer ownership and not keep a special role. You would have all the rights he and the other new maintainers have.