Prof-Lu-Cewu / Visual-Relationship-Detection


Training code and pre-trained model? #1

Open sxjzwq opened 8 years ago

sxjzwq commented 8 years ago

Hi,

Thanks for the code. It is great work!

Could you also provide the training code, so that we can train our own W and b in Eq. (2)? Would it also be possible to provide your pre-trained models for object detection and predicate prediction, so that we can extract 'objectDetRCNN.mat' and 'UnionCNNfeaPredicate.mat' for our own images?

Thanks!

erobic commented 8 years ago

@sxjzwq Were you able to train a CNN to classify predicates? The paper states: "Similarly, we train a second CNN (VGG net [44]) to classify each of our K = 70 predicates using the union of the bounding boxes of the two participating objects in that relationship". I am just wondering how much accuracy we can expect from this CNN.
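For reference, the "union of the bounding boxes" in the quoted sentence can be read as the smallest axis-aligned box that encloses both the subject and the object. The snippet below is only a minimal sketch of that reading, not the authors' code; the `(x_min, y_min, x_max, y_max)` box format and the names `Box` and `union_box` are assumptions made for illustration.

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def union_box(subject: Box, obj: Box) -> Box:
    """Return the smallest axis-aligned box enclosing both input boxes."""
    return (
        min(subject[0], obj[0]),  # leftmost edge
        min(subject[1], obj[1]),  # topmost edge
        max(subject[2], obj[2]),  # rightmost edge
        max(subject[3], obj[3]),  # bottommost edge
    )


# Toy example with made-up coordinates.
person = (10.0, 20.0, 110.0, 220.0)
bike = (60.0, 150.0, 200.0, 260.0)
print(union_box(person, bike))  # (10.0, 20.0, 200.0, 260.0)
```

The image region cropped with this box is what would be fed to the predicate classifier under this reading.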

Prof-Lu-Cewu commented 8 years ago

Yes, we are about to train a CNN to classify predicates, but the accuracy is not good enough.


From: Robik Shrestha notifications@github.com Sent: October 8, 2016, 13:37 To: Prof-Lu-Cewu/Visual-Relationship-Detection Subject: Re: [Prof-Lu-Cewu/Visual-Relationship-Detection] Training code and pre-trained model? (#1)


sxjzwq commented 8 years ago

@erobic We didn't try to train a CNN to classify predicates. However, we tried to train a CNN to predict whole relationships directly. We pre-built a vocabulary with 15,000 elements. In this vocabulary, each label is a triple such as <bag, on, table>, which means each label is a relationship instance. So <bag, on, table> and <bag, under, table> are two different labels. We treat this as a multi-label classification problem and train a VGG net on the Visual Genome dataset. This sounds crazy, but we actually got some reasonable results, and we use it as our baseline. The drawback of this approach is that the 15,000 relationships only cover around 70% of the relationships in Visual Genome.
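To make the triple-as-label setup concrete, here is a rough sketch of how such a vocabulary and the multi-label targets could be built. It is not the code used for the baseline: picking the most frequent triples is an assumption (the comment only says the vocabulary was pre-built), and the names `build_vocab`, `multi_hot`, and `coverage` as well as the toy annotations are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def build_vocab(annotations: List[List[Triple]], size: int) -> Dict[Triple, int]:
    """Map the `size` most frequent triples to class indices (assumed selection rule)."""
    counts = Counter(t for image in annotations for t in image)
    return {t: i for i, (t, _) in enumerate(counts.most_common(size))}


def multi_hot(triples: List[Triple], vocab: Dict[Triple, int]) -> List[float]:
    """Encode one image's triples as a multi-label target; out-of-vocabulary triples are dropped."""
    target = [0.0] * len(vocab)
    for t in triples:
        idx = vocab.get(t)
        if idx is not None:
            target[idx] = 1.0
    return target


def coverage(annotations: List[List[Triple]], vocab: Dict[Triple, int]) -> float:
    """Fraction of annotated relationship instances that fall inside the vocabulary."""
    all_triples = [t for image in annotations for t in image]
    if not all_triples:
        return 0.0
    return sum(t in vocab for t in all_triples) / len(all_triples)


# Toy annotations: one list of triples per image (made-up data).
annotations = [
    [("bag", "on", "table"), ("person", "hold", "bag")],
    [("bag", "on", "table"), ("bag", "under", "table")],
    [("bag", "on", "table"), ("person", "hold", "bag")],
]
vocab = build_vocab(annotations, size=2)
print(multi_hot(annotations[1], vocab))  # [1.0, 0.0]
print(coverage(annotations, vocab))      # 5 of 6 instances are covered
```

Each target vector would then be paired with a per-class sigmoid output and a binary cross-entropy loss, which is the usual choice for multi-label classification. The `coverage` value is the toy analogue of the roughly 70% figure mentioned above.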

I still wish @Prof-Lu-Cewu can publish their training code, then we can try it on VG or other datasets.

ronghanghu commented 8 years ago

I still wish @Prof-Lu-Cewu can publish their training code, then we can try it on VG or other datasets.

+1

liqing-ustc commented 7 years ago

Ask for the training code +1

jesiws commented 7 years ago

+1
Many thanks!!!

wtliao commented 7 years ago

Asking for the complete training code, too. It's really great work. I'm trying to build the whole framework, but too many problems have come up, so I hope the code can be shared to help me understand the proposed approach better.

Many thanks!

yanxp commented 7 years ago

Ask for the training code +1

nikita0511 commented 7 years ago

asking for training code +1

singhanj13 commented 6 years ago

Asking for the training code +1 !

jamesben6688 commented 6 years ago

Asking for the training code +1

Narcissuscyn commented 6 years ago

Asking for the training code +1

thanks very much!

fomalhaut-b commented 3 years ago

I know this thread is way too old, but... training code would be great!

+1!

Thanks, awesome work

samahwaleed commented 3 years ago

Asking for the training code

YOUNG-bit commented 2 years ago

Asking for the training code!!