DavidLandup0 opened 1 year ago
@tanzhenyu What is our (sustainability) policy about brand new papers?
We had a quite long discussion with @LukeWood and @innat at https://github.com/keras-team/keras-cv/discussions/52#discussioncomment-2058663
Thanks for linking the discussion! My two cents: if an arch can be useful for end-users, we should consider it based on how modular it can be (a big factor for KCV), how reproducible it is (simple vs. difficult training pipeline, big or small models, official public repo or not, etc.), how widely accepted it is (how many implementations and usages exist worldwide), and how useful it could be for end-users.
A computer vision library dedicated to auto-driving, robotics, and on-device applications.
While the vision may change, it'll conceivably stay on the path of "bringing CV to production". If a model is aligned with that and shows clear advances, it should be added IMO.
With brand-new archs, it's hard to test whether they show clear advances without further peer review and usage, but we already have ConvNeXt in both keras.applications and keras_cv.models, which validates the structure and usefulness of ConvNeXtV2.
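For reference, ConvNeXt (V1) can already be instantiated straight from keras.applications (available since TF 2.10); a minimal sketch:

```python
import tensorflow as tf

# ConvNeXt (V1) already ships in keras.applications (TF >= 2.10).
# weights=None just builds the architecture, without downloading
# the ImageNet weights.
model = tf.keras.applications.ConvNeXtTiny(weights=None, include_top=True)
print(model.output_shape)  # (None, 1000): head over 1000 ImageNet classes
```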
Also, the acceptance criteria might be different between backbones and narrow architectures - backbones are generally useful. Not all backbones improve downstream tasks though. IMO, if a backbone reports improved downstream task performance, we should give it more weight.
I think that, especially for a backbone, the test of time is how many papers use that specific backbone (in this specific case, it was published just a few days ago).
Then IMHO we have an open sustainability question around accumulating components, as we still don't have clear code ownership of the library/modules and we are missing the related MIA handling (https://github.com/keras-team/keras-cv/discussions/1184 and https://github.com/keras-team/keras-cv/discussions/950#discussioncomment-3926423). If we rely only on the few internal team members, I suppose we will hit a maintainership boundary/bottleneck sooner or later, as in any process working with quite limited resources.
I also agree that just porting reference weights is less time-consuming and risky, but we still haven't clarified whether we aim to train backbones from scratch as well, and what our opinionated position is on releasing weights that start with a performance gap on downstream tasks when we do train backbones from scratch.
I think in production, when working on downstream tasks, many users are interested in starting from the top-performing pre-trained weights. So sometimes, with limited resources, I prefer downstream reproducibility of our training scripts starting from well-known, high-performing weights (this is still quite confusing: https://github.com/keras-team/keras-cv/issues/495).
On the other hand, as I have seen in many PRs, our training process is still quite "artisanal" (https://github.com/keras-team/keras-cv/discussions/954), and it is often quite hard to get on the same page about the reproducibility process. My impression is also that we still need to target only free resources (Colab) if we want pre-CI training proxy checks on the contributor side, so as not to create a contribution wall around the required HW resources.
/cc @tanzhenyu @LukeWood @ianstenbit @martin-gorner
Thanks @DavidLandup0 @bhack for the discussion. There's no golden threshold here, but there's a core value we need to stick to, which is "KCV is for production and applied ML". So below are a couple of factors:
The list can go on and we don't have immediate plans to write down what should be the standard, but we try to be agile and answer questions such as "should we include XXX model".
As for this model, I think it's OK to include it, but it's not prioritized at this moment.
I think in production, when working on downstream tasks, many users are interested in starting from the top-performing pre-trained weights.
Agreed. Many production use-cases boil down to replacing a backbone with a slightly better one. This includes Kaggle competitions, which can IMO benefit a lot from KCV. For tricky-to-train models, like ViTs and ConvNeXt - we might want to focus on serving them primarily for downstream tasks (fine-tuning, segmentation, object detection, etc.).
We could then separate "trainable" and "tricky-to-train" architectures, where we produce the weights with public scripts here for the former, but port them for the latter.
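The "replace a backbone with a slightly better one" pattern above can be sketched with plain Keras. This is a minimal, hypothetical example (the head, class count, and input shape are illustrative, not from the thread): freeze a pre-trained backbone and attach a small downstream classification head.

```python
import tensorflow as tf

# Hypothetical sketch of swapping in a ConvNeXt backbone for a
# downstream classification task. In practice you'd pass
# weights="imagenet"; weights=None avoids a download here.
backbone = tf.keras.applications.ConvNeXtTiny(
    include_top=False,
    weights=None,
    input_shape=(224, 224, 3),
)
backbone.trainable = False  # freeze the backbone for initial fine-tuning

num_classes = 10  # illustrative
model = tf.keras.Sequential([
    backbone,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

Swapping to a "slightly better" backbone then means changing only the `backbone =` line, which is exactly why modularity is such a big factor for KCV.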
I agree with @tanzhenyu as well that it might be tricky to create a list and check whether a model fits criteria. Maybe sometime down the line, we make a list of say, 10 bullet points, and accept an arch if it ticks 6+ of those, for example. For now at least, while there's still lots of ground left to cover, we can probably sensibly make decisions on the fly?
Yes, I think my points are general enough and not tied to an "algorithm" for inclusion.
My points are more focused on the general sustainability of the library (code ownership) and on not raising the contribution barrier too much with respect to HW resources (devinfra/CI).
So I think these points can still be handled in the related tickets/discussions. I have already mentioned the relevant tickets/discussions for all of them, so I don't think we need to discuss them here if we can make some progress in those threads in 2023.
Short Description
Just released: ConvNeXt V2, extending ConvNeXt with a new internal Global Response Normalization (GRN) layer.

Papers
https://arxiv.org/abs/2301.00808

Existing Implementations
https://github.com/facebookresearch/ConvNeXt-V2

Other Information
If accepted, sign me up for the PR!