kubeflow / examples

A repository to host extended examples and tutorials
Apache License 2.0

[Enhance] Image enhancement example #59

cwbeitel closed this issue 6 years ago

cwbeitel commented 6 years ago

Goals:

Steps:

Potential additional or non-steps:

Current PR: https://github.com/kubeflow/examples/pull/60
Readme: https://github.com/cwbeitel/examples/tree/enhance/enhance

jlewi commented 6 years ago

This is a great idea.

Would it be possible to merge the aspects proposed in this example with one of our existing examples, so that we can collectively work to produce a set of high quality examples?

We have two main examples in progress

/cc @elsonrodriguez @yupbank

cwbeitel commented 6 years ago

I think people should communicate about what their barriers are and, where there is tooling worth sharing, break it out into tools/. I pointed @ankushagarwal to my launcher code yesterday and also got in touch with @texasmichelle; we're meeting on Friday to look through it and talk about the various issues.

My opinion is that examples should stay separated by application (and not be grouped by input or output modality or model type).

I'm not clear on how the issue summarization example is using t2t currently (it looks like it isn't).

Also, I'll note that the above examples, as far as I understand, are meant to be e2e in the sense of ending in serving that is accessible via a web page, which is a different sense of e2e from the batch data pipeline I'm demonstrating here.

If the launcher, job models, and utils for shipping workspace code were broken out into tools/, what would remain in this example would be only the docs and the t2t_usr_dir containing the data download utilities, the t2t Problem definition, and a convenience wrapper for t2t-decoder. This level of simplification could be shared by other examples that go the t2t route.

jlewi commented 6 years ago

Good solutions are a lot of work.

All of which leads me to conclude that I think we will be much more successful if we can build a community of folks building and maintaining various examples.

The two examples I mentioned above are ones with momentum that I think could be used to accomplish at least some of the goals of this issue.

For example, why couldn't we add a batch prediction component to either of those two examples?

cwbeitel commented 6 years ago

I think the best course of action is for me to continue hacking around with this example separately and to communicate with @texasmichelle and @ankushagarwal about how much of the strategy I'm advocating here they're interested in incorporating. If they're interested in using enough of it, then I'll just contribute to that example, without loss of benefit to myself and, it sounds like, with increased benefit to this project. If it's something in between, then we'll figure something out.

jlewi commented 6 years ago

I think it would be great if we could find a way to have more people working together to produce a small subset of high quality samples that can be used to highlight Kubeflow.

The core value of Kubeflow is that we make it easy to deploy and manage all the components needed to do ML.

So having a small set of samples that each highlight a bunch of those components allows us to tell a much better story than trying to build a new example to highlight each component.

Furthermore, incorporating new components into the samples should be much easier. For example, if you want to highlight inference you don't need to first create a sample to train the model because it already exists.

For example, @elsonrodriguez is developing an example based on mnist to highlight the KVC Intel has been developing.

@yixinshi needs an example suitable for large-scale batch inference, especially with GPUs (kubeflow/kubeflow#251).

We have kubeflow/example-seldon to highlight serving with Seldon (GitHub Issue summarization is also using Seldon).

So right now we are missing samples that demonstrate a lot of things

There is a list of possible examples here

Some other examples that might allow us to check a lot of boxes are

cwbeitel commented 6 years ago

I hear you. I'll talk to @texasmichelle tomorrow about how I can contribute there.

The video labeling and next-frame prediction problems are very interesting to me and I would like to work on that at some point.

To get people more integrated, we could start having weekly example developer meetings of ~45 min to 1 h where people go around and present progress, challenges, prototypes, etc.

jlewi commented 6 years ago

Website for object detection including tools and models https://github.com/openimages/dataset

cwbeitel commented 6 years ago

Cool. Given the above and after the discussion today, it's clear to me that the above exercise is of value to the project, but not currently as an additional example in this repository. I'm closing this issue and the related PR, and we'll have separate discussions about which individual pieces, if any, might be incorporated into existing examples. The air-tight front-facing kubeflow examples can always link to additional applications elsewhere.