NVlabs / SPADE

Semantic Image Synthesis with SPADE
https://nvlabs.github.io/SPADE/

[Idea] Crowd-training of unreleased Flickr landscape model #17

Open sidyakinian opened 5 years ago

sidyakinian commented 5 years ago

As you probably know, the pre-trained Flickr landscape model could not be released in this repo. But that model can draw landscapes with unbelievable quality, much higher than that of coco-stuff, due to training on 40k Flickr images; the fact that it hasn't been released is disappointing.

Some of us probably want to train it ourselves (me for one, and @Lokiiiiii also brought this up). Typically it would cost a few thousand dollars. Thankfully, Product Hunt offers a paid subscription that includes $5000 in AWS credits for $720: https://www.producthunt.com/ship#launch (Product Hunt Ship Pro, yearly subscription)

This gets the cost down to $720, but it's still a lot. Since a few of us are going to do the same exact thing, why don't we train the model together and share the cost? $720 split among 5 people is already $144, which is fair for such a powerful model.
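The arithmetic behind the split can be sketched as follows. This is only an illustration: the function name `cost_per_person` is made up here, and the $720 figure is the subscription price quoted above, not a measured training cost.

```python
def cost_per_person(total_cost: float, participants: int) -> float:
    """Split a fixed cost evenly among the participants."""
    if participants < 1:
        raise ValueError("need at least one participant")
    return total_cost / participants


if __name__ == "__main__":
    # $720 is the subscription price mentioned above; group sizes are examples.
    for n in (1, 5, 10):
        print(f"{n} people: ${cost_per_person(720, n):.2f} each")
```

Each additional participant lowers everyone's share, so the per-person cost keeps dropping as more people join.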

Once we have a few people in, we can start a crowdfunding campaign, pledge funds, train the model and share it among us.

What do you think of this?

code-de commented 5 years ago

That's a great idea! Count me in.

hologerry commented 5 years ago

And me.

banyet1 commented 5 years ago

Also count me in.

rslowinski commented 5 years ago

Under what license would such a model be released?

sidyakinian commented 5 years ago

Under what license would such a model be released?

@ares97 Likely the same license that SPADE was released under

sidyakinian commented 5 years ago

Okay, looks like we have a few people. Let's form a group chat in some messenger to discuss the next steps. @code-de, @hologerry, and @banyet1, would you please email me or comment here with the messengers you find convenient?

We can then choose the messenger everyone picked and host the group chat there. My email is on my profile page.

@harsh2204, @wasd96040501, if you decide to join, you're welcome to email me too!

hologerry commented 5 years ago

Telegram would be great. Same username as Github.

code-de commented 5 years ago

Telegram, Discord, WhatsApp work well for me - emailed you my usernames in each of them. Btw, @ares97, @Lokiiiiii, @aeti-in - you guys seem to be interested in this as well?

genekogan commented 5 years ago

@sidyakinian i am also interested in this. please include me!

sidyakinian commented 5 years ago

@sidyakinian i am also interested in this. please include me!

@genekogan Sure! Please remember to email me or comment some messengers so that you can join.

banyet1 commented 5 years ago

I'm using WhatsApp. I just sent you an email, please check it out.

sidyakinian commented 5 years ago

@hologerry, @banyet1 You picked different options (Telegram and WhatsApp). Could one of you use the other messenger so that we can gather in one chat? Either of those two is fine with me and @code-de.

banyet1 commented 5 years ago

I'm using Telegram as well, with the same details as on WhatsApp.

aman-tiwari commented 5 years ago

I assume there are no segmentation masks for this dataset available, right? So those will also have to be made (or inferred using another network)

sidyakinian commented 5 years ago

@aman-tiwari Yes, we'll have to create the dataset ourselves. Thankfully, SPADE researchers used DeepLabV2 for it, which works pretty quickly. We'll just use that or something similar.

noyoshi commented 5 years ago

Not sure how much help I could be for actually training the network, but I would love to know if / when this happens! I made a pretty rudimentary web UI for this, and would love to be able to use the Flickr model on it. Source code and public site: http://www.smartsketch.xyz

samuelpietri commented 5 years ago

@sidyakinian count me in as well, I just sent you an email

sidyakinian commented 5 years ago

@noyoshi Training will take 1-2 weeks.

As for being of help, the guys and I can go two ways: pool our own GPU resources, or buy AWS time. If we settle on the latter, anyone could help by pledging money; most likely only one of us (the most experienced one) will actually train the model.

mingyuliutw commented 5 years ago

As we mentioned at GTC, we are making an online demo for everyone to play with the Flickr model (on any mobile device or desktop) and likely a standalone version for everyone with an NVIDIA GPU. Hopefully, it will not take too long.

sidyakinian commented 5 years ago

@mingyuliutw That's awesome! I've heard somewhere that it's going to be released in summer.

Flova commented 5 years ago

It's maybe a dumb question, but what's the main reason the Flickr model is missing? Is it its sheer size, or are there licensing issues?

bitcoinmeetups commented 5 years ago

Following

sidyakinian commented 5 years ago

@bitcoinmeetups Hi! Please email me your Telegram if you wanna join the chat

datar-ai commented 5 years ago

It's awesome! Count me in as well.

mod-cpu commented 5 years ago

Very interested. Count me in

banyet1 commented 5 years ago

We finished training on the Flickr dataset (41K images) last week.

bensnell commented 5 years ago

@banyet1 What are your plans now that the model is complete? Do you have any intention of making it available to others? I would love to test it out.

genekogan commented 5 years ago

Hey everyone! We've finished training the model and it's available here https://drive.google.com/open?id=1QJr5HBv8PAjJuVNB9zf8EiA6IcIVCswa Here's a video of it in action: https://twitter.com/genekogan/status/1136261959970709504

Seth-Park commented 5 years ago

@genekogan Great work! Are there any plans to release the collected Flickr dataset?

taki0112 commented 5 years ago

@genekogan Nice !! Do you have any plans to release the dataset?

mingyuliutw commented 5 years ago

The official Flickr model is available as a web demo via https://www.nvidia.com/en-us/research/ai-playground/

Better models will likely come in the summer. Stay tuned.

aeti-in commented 5 years ago

Thanks, really appreciate it.


prusnak commented 5 years ago

@mingyuliutw Is there a plan to release the actual model, not just the tool to play with it?

huge123 commented 5 years ago

@genekogan Thanks for your efforts, could the flickr dataset be shared with us?

genekogan commented 5 years ago

i'm not sure if releasing the dataset would violate the licenses of the actual photos as they belong to other people. if we can release it, i have no problem with that.

aviel08 commented 5 years ago

Well, that's an interesting subject we're facing with machine learning. I don't think we are infringing any copyright law, since we are not using any piece of the dataset in its original form but in a novel way. I'm no lawyer, of course.


huge123 commented 5 years ago

i'm not sure if releasing the dataset would violate the licenses of the actual photos as they belong to other people. if we can release it, i have no problem with that.

It is indeed a tough problem. What method did you use to extract the semantic layout of the Flickr images? Did you annotate some samples and train a model on your own, or use an existing model?

aviel08 commented 5 years ago

What method did you use to extract the semantic layout of the Flickr images? Did you annotate some samples and train a model on your own, or use an existing model?

I always create my own datasets, either creating my labels by hand or generating them from a 3D application, but I know this is a very specific scenario.

prusnak commented 5 years ago

@genekogan Flickr contains lots of photos which are licensed under the Creative Commons license. If you pick only these for training, I am pretty sure the trained model could be published under the same license again.
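A minimal sketch of that filtering step, assuming each photo's metadata has already been fetched into a dictionary with a `license` field. The field name, the license strings, and the `ALLOWED_LICENSES` set are hypothetical placeholders for illustration; real Flickr metadata may identify licenses differently, and which licenses are acceptable is a legal question this sketch does not answer.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical license labels; adjust to whatever the metadata actually contains.
ALLOWED_LICENSES: Set[str] = {"CC BY", "CC BY-SA", "CC0"}


def filter_by_license(photos: Iterable[Dict[str, str]],
                      allowed: Set[str] = ALLOWED_LICENSES) -> List[Dict[str, str]]:
    """Keep only photos whose 'license' field is in the allowed set."""
    return [p for p in photos if p.get("license") in allowed]


if __name__ == "__main__":
    # Made-up records, purely to show the shape of the data.
    sample = [
        {"id": "a", "license": "CC BY"},
        {"id": "b", "license": "All Rights Reserved"},
        {"id": "c"},  # missing license info is treated as not allowed
    ]
    for photo in filter_by_license(sample):
        print(photo["id"], photo["license"])
```

Treating a missing license as "not allowed" is the conservative choice here: a photo only enters the training set when its license is explicitly known.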

huge123 commented 5 years ago

@aviel08 @genekogan Is it permissible to share some semantic labels for the Flickr images along with the dataset/dataloader script? I just want to test the pretrained model. Thanks.