dmarx / Multi-Modal-Comparators

Unified API to facilitate usage of pre-trained "perceptor" models, a la CLIP

improved packaging #26

Open rom1504 opened 2 years ago

rom1504 commented 2 years ago

Hi! As part of our package for easily evaluating CLIP models (https://github.com/LAION-AI/CLIP_benchmark/issues/1) and my inference lib (https://github.com/rom1504/clip-retrieval),

I'm interested in having a package like this.

However, here's what's currently missing:

  1. PyPI packaging
  2. a much clearer README; usage should be five lines of Python that can be copy-pasted
  3. performance evaluation
  4. possibly optional dependencies, to keep the dependency list from growing too large

I may be interested in contributing all of that, but first I'd like to check whether you're OK with those kinds of changes.

Thanks

dmarx commented 2 years ago

Absolutely! For additional context, here's some discussion of project goals:

Responding to your specific suggestions:

tl;dr: I would absolutely love to have your support and invite your contributions. The current packaging (especially the installation) is a consequence of the problem I am trying to solve and I am definitely open to solving it in other ways.

rom1504 commented 2 years ago

OK, sounds good. I'll get started on the PyPI part. My plan is simply to get all the dependencies onto PyPI so we can depend on them directly from PyPI. I did that in the past with CLIP (https://github.com/rom1504/clip), published as the clip-anytorch module; I can do the same for other things, preferably by making PRs directly to the main repos. For projects that don't accept PRs, fork-and-publish works.

rom1504 commented 2 years ago

Hmm, there's just one last thing I'm wondering: is it better to build our own thing (like you did), or to plug into an existing platform like Hugging Face models, https://hub.jina.ai/, or https://github.com/UKPLab/sentence-transformers? Did you investigate this topic?

dmarx commented 2 years ago

> OK, sounds good. I'll get started on the PyPI part. My plan is simply to get all the dependencies onto PyPI so we can depend on them directly from PyPI. I did that in the past with CLIP (https://github.com/rom1504/clip), published as the clip-anytorch module; I can do the same for other things, preferably by making PRs directly to the main repos. For projects that don't accept PRs, fork-and-publish works.

This was the approach I was originally planning to take as well, but @apolinario made a great point in an earlier discussion that it doesn't scale well. I created napm as an alternative pattern for managing dependencies that can't be pip-installed; we could probably extend that functionality/recipe for packages that have a setup.py but aren't hosted on PyPI. It's a somewhat hacky solution, but it doesn't commit us to keeping tens or hundreds of forks up to date. Have you had a chance to look at the CLOOB example I linked?

> Is it better to build our own thing (like you did), or to plug into an existing platform like Hugging Face models, https://hub.jina.ai/, or https://github.com/UKPLab/sentence-transformers?

That's definitely something we could consider further. I suspect this approach has scaling issues similar to the forking approach above, but clever CI might get us around that.

I've generally experienced a lot of headaches with HF's CLIP implementation, and at present the only models mmc still has issues with are the ones it pulls from HF (CLIP-fa and SBERT-mCLIP). I'm a little concerned about tacitly directing researchers to host their CLIP models on HF if it means they'd be more likely to gravitate toward the troublesome HF CLIP implementation. If we go the HF route, we might want to start by adding a more HF-friendly base CLIP implementation, or improving the one that's up there.

I haven't used Jina AI in a while, but a teammate was recently singing the praises of Jina's clip-as-a-service inference API. I'd already been thinking about adding a class to mmc to support inference against APIs like Jina's and HF's (#24): if we could set up hosted inference APIs for the models we hope to support, that could be an extremely powerful paradigm. I think Jina's CaaS is essentially a product, though, not a generic inference API they plan to let people plug their own models into for free. However, as I'm sure you're already aware, that is a feature the HF hub already supports.

So I guess it all comes down to a cost-benefit analysis. We're currently using a duct-tape-and-bubblegum approach that lets us add support for new models fairly quickly, without requiring that we maintain forks or push changes upstream, at the cost of ignoring a lot of dependency-management best practices and crossing our fingers that nothing breaks. We could prioritize improved stability instead, but that comes at the cost of a potentially much higher level of effort to onboard new models, and it commits us to ongoing maintenance that the hackier approach pushes back onto the reference repos.

...Maybe we can have our cake and eat it too? We could use the pattern we've already established here to get basic support quickly, then migrate the more popular models onto hosted (Jina/HF) and/or better-supported (upstream PRs, forks) solutions. Maybe we could get AK or someone else who plays with the HF hub/Spaces to chime in on this?

rom1504 commented 2 years ago

Yeah, all good points. So here's what I propose: I'll try contributing to your solution first, then we'll see how things go; we can decide to move the code onto a platform later on.

dmarx commented 2 years ago

@rom1504 just checking in. Have you had a chance to take a stab at this? I'd love it if we could distribute this on PyPI rather than asking users to build the wheel locally.

rom1504 commented 2 years ago

Hey, sorry, I haven't had time yet. I still believe it should be done, though.

If you want to start on it yourself, you could begin by copying https://github.com/rom1504/python-template/blob/main/.github/workflows/python-publish.yml into the same folder here.

Here's a recent PR I just did that does the same thing in another repo: https://github.com/TheoCoombes/ClipCap/pull/5

rom1504 commented 2 years ago

You'd need a setup.py too; you can take the one in the template repo as an example.

I'll take a look soon
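
To make that concrete, a minimal setup.py in the style of that template might look like the sketch below. All of the metadata and dependency names here are illustrative placeholders, not the project's actual configuration:

```python
# Minimal setup.py sketch, modeled on rom1504's python-template.
# Package name, version, and dependencies are illustrative placeholders.
from setuptools import find_packages, setup

setup(
    name="mmc",  # hypothetical distribution name
    version="0.1.0",
    description="Unified API for pre-trained 'perceptor' models, a la CLIP",
    packages=find_packages(),
    install_requires=[
        # core, PyPI-resolvable deps only; direct git URLs don't belong here
        "torch",
        "clip-anytorch",  # PyPI-published fork of OpenAI CLIP
    ],
    extras_require={
        # heavier, rarer backends kept optional, as discussed above
        "hf": ["transformers"],
    },
)
```

With that plus the publish workflow, a tagged release would build and upload the wheel automatically.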

dmarx commented 2 years ago

I think it's a bit more involved than that because of the GitHub dependencies. This is the reason I didn't put it on PyPI myself already: the wheel failed some kind of validation. I don't think PyPI will host wheels whose dependencies point directly at git URLs, since that bypasses the normal dependency-resolution flow. Not sure what I did with my notes.
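
For context, I believe the validation failure is PyPI rejecting uploads whose metadata carries PEP 508 direct-URL requirements. A tiny illustration (requirement strings here are examples, not our actual pins):

```python
# A PEP 508 direct-URL requirement: this installs fine locally with pip,
# but PyPI refuses to host a package that declares it as a dependency.
direct_url_req = "clip @ git+https://github.com/openai/CLIP"

# A plain name-based requirement pointing at a published fork is acceptable.
publishable_req = "clip-anytorch"
```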

One approach I was thinking we could take (which would also scale better than the current approach of requiring users to install the codebase for every CLIP variant mmc supports) would be an internal download/install mechanism that ships with mmc, similar to how nltk's model and corpus downloads work. That specific approach is a bit clunky, but it's my current thinking at the moment anyway.

Another tack we could take would be to push the installations into the loaders: if a pip-installable git dependency can't be imported, the loader shells out to something like `pip install --upgrade git+{url}` via `subprocess`. I dislike this approach for a variety of reasons, but considering the entire project here is essentially a collection of square-peg-round-hole adaptors, maybe something like this is among the least-worst solutions?
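
A rough sketch of what that loader-side fallback could look like. `ensure_installed` is a hypothetical helper, not something that exists in mmc today:

```python
import importlib
import subprocess
import sys


def ensure_installed(module_name: str, git_url: str):
    """Import module_name, pip-installing it from git_url first if it's missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # Fall back to installing the git dependency at load time.
        subprocess.run(
            [sys.executable, "-m", "pip", "install", "--upgrade", f"git+{git_url}"],
            check=True,
        )
        return importlib.import_module(module_name)
```

A loader could then call e.g. `ensure_installed("clip", "https://github.com/openai/CLIP")` before constructing the perceptor.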

rom1504 commented 2 years ago

Why can't we simply fix all the dependencies to be installable from PyPI?

dmarx commented 2 years ago

That was the approach I was originally taking, and apolinario convinced me otherwise. See the discussion here (tl;dr: scaling concerns): https://github.com/dmarx/Multi-Modal-Comparators/pull/14

Also, making these dependencies installable from PyPI is itself potentially non-trivial. Consider, for example, Kat's CLOOB, which itself has three GitHub dependencies: webdataset, jmp, and CLIP. CLIP does have a PyPI-hosted fork, but it hasn't been updated since October, and consequently its clip.available_models() is missing a few late-release checkpoints. webdataset is only needed for training, so that dependency is droppable, but jmp is still needed for JAX inference. So there's at least the potential for a PyPI-hosted, PyTorch-only CLOOB perceptor, but you can see how these edge cases and caveats stack up. And that example was authored by an extremely talented developer; many of the dependencies relevant to this discussion will probably be a lot messier.

rom1504 commented 2 years ago

Yes, I do think this is the only scalable approach. I've been maintaining that CLIP package, https://pypi.org/project/clip-anytorch/, and it takes very little time.

Properly publishing all these packages will help not only this aggregator package but also everyone using these packages directly.

rom1504 commented 2 years ago

If you mean non-scalable as in "this will install too many things", that's a fairer point, but my opinion is the same: we can fix the packaging of all the dependencies.

If some rare dependencies really do pull in too much, then we make them optional and tell people to pip install them themselves (and/or run your function to install them, which amounts to the same thing).

dmarx commented 2 years ago

I meant "non-scalable" in the sense of "this will likely create a lot of technical burden on the mmc maintainers," which mostly means me. If you want to take responsibility for converting mmc dependencies into installable packages and keeping the installable forks up to date with upstream changes, I have no problem redirecting mmc to point at your forks; I'm just not personally inclined to take on that responsibility for the majority of mmc's dependencies. My current strategy is to prioritize making these pre-trained models broadly available quickly. If certain models shipped in this fashion prove particularly useful, it might then make sense to create and maintain PyPI-hosted, installable forks of those specific repositories, the intention being to help downstream users get those tools without the tangled mess the rest of the models bring with them.

I'd like to have this hosted on PyPI, but asking users to deal with one slightly weird package setup doesn't feel like a big ask to me if it replaces a bunch of other (potentially weirder) package setups. It's not perfect, but it's a huge improvement over how our target users are currently forced to do things.

So to reiterate: my concern here is about how your suggestion would impact my personal workload. If you'd like to take responsibility for that work, or can find someone else who's open to it, I have no problem redirecting the pyproject dependencies to point at PyPI instead of GitHub. But creating and maintaining those forks is not a responsibility I'm inclined to volunteer for at the moment, for reasons I think I've already explained as best I can.

dmarx commented 2 years ago

Also: I didn't even know clip-anytorch existed.

rom1504 commented 2 years ago

I see! Yeah, I expect the packaging work for all these packages is much less than you think. But I do hear your concern, so I'll just do it when I have some time, to show what I mean :)

rom1504 commented 2 years ago

Also, to clarify one thing: the reason I want this packaged properly, with real dependencies on PyPI and all, is that that's the only way I know of to make it clean enough that I can depend on it from other places with confidence it will keep working. So for me it's a necessary point.

dmarx commented 2 years ago

Yeah, for sure, I totally get that. As I mentioned, I've been there myself.

rom1504 commented 2 years ago

Ah, I've got this one too: https://pypi.org/project/taming-transformers-rom1504/#history :D