BCV-Uniandes / DMS

Dynamic Multimodal Instance Segmentation Guided by Natural Language Queries, ECCV 2018
https://biomedicalcomputervision.uniandes.edu.co
MIT License

Testing on Custom Dataset #32

Closed: Shivanshmundra closed this issue 5 years ago

Shivanshmundra commented 5 years ago

Can you guide me on testing with my own dataset? How does the data need to be created, and what should I keep in mind?

andfoy commented 5 years ago

Hi, thanks for contacting us.

To use DMN with a custom dataset, you only need images and referring expressions. You can look at our loader for the RefCOCO* datasets to get an intuition about how data is loaded into our model.

Please keep in mind that if you are going to use the provided pretrained weights, the new vocabulary must be merged with the RefCOCO one; otherwise, the output is going to be empty.
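
For illustration, here is a minimal sketch (plain Python, not the repository's actual Corpus code) of what merging a custom vocabulary into an existing word-to-index mapping could look like; the dictionary contents below are made up. Note that any newly appended words would still need trained embeddings to be useful with the pretrained model.

```python
# A minimal sketch of merging a custom vocabulary into an existing word-to-index
# mapping while keeping the original RefCOCO indices untouched.
refcoco_vocab = {'<pad>': 0, '<unk>': 1, 'sofa': 2, 'wall': 3}   # stand-in for the RefCOCO corpus
custom_words = ['sofa', 'armchair', 'curtain']                   # words from your own dataset

merged = dict(refcoco_vocab)          # copy so existing indices are preserved
for word in custom_words:
    if word not in merged:            # only append genuinely new words at the end
        merged[word] = len(merged)

print(merged)
# {'<pad>': 0, '<unk>': 1, 'sofa': 2, 'wall': 3, 'armchair': 4, 'curtain': 5}
```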

Shivanshmundra commented 5 years ago

Hi @andfoy, thanks for such a fast reply. So I need to choose sentences using the vocabulary already present in the dataset? The dataloader also has masks; do I need to generate masks from a Mask R-CNN model, or are they extracted in the model itself? Also, while running the evaluation script (say, on the UNC data), how do I visualize the images, their corresponding segmentation masks, and the referring expressions? All I see after running the script is scores.

andfoy commented 5 years ago

So I need to choose sentences using the vocabulary already present in the dataset? The dataloader also has masks; do I need to generate masks from a Mask R-CNN model, or are they extracted in the model itself?

The masks are only required for evaluation purposes. If you only want to run inference with DMN, you only need an image and a referring expression.
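
As a rough illustration of what "only an image and a referring expression" means in terms of inputs, here is a hedged sketch of the preprocessing; the transform size and the toy vocabulary are assumptions, not the exact pipeline used in dmn_pytorch's loader.

```python
# Hedged sketch of preparing the two inference inputs: an image tensor and a
# tensor of word indices.
import torch
import torchvision.transforms as T
from PIL import Image

# Stand-in image; in practice, use Image.open('your_image.jpg').convert('RGB').
image = Image.new('RGB', (640, 480), color=(128, 128, 128))
transform = T.Compose([T.Resize((512, 512)), T.ToTensor()])
img_tensor = transform(image).unsqueeze(0)       # shape: 1 x 3 x 512 x 512

# Toy word-to-index mapping; the real one comes with the pretrained weights.
vocab = {'sofa': 2, 'against': 3, 'the': 4, 'wall': 5}
phrase = 'sofa against the wall'
words = torch.tensor([[vocab[w] for w in phrase.lower().split()]])   # shape: 1 x num_words

# These two tensors are all the model needs at inference time; no mask is involved.
print(img_tensor.shape, words.shape)
```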

andfoy commented 5 years ago

how do I visualize the images, their corresponding segmentation masks, and the referring expressions? All I see after running the script is scores.

We have a separate script for visualization that uses Visdom as the graphical backend. Please take a look at it to get more insight into how to visualize DMN's output segmentation maps.
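
For reference, a minimal stand-alone Visdom example (not the repository's visdom_display script) looks roughly like this; it assumes a Visdom server started with python -m visdom.server and uses random tensors in place of a real image and predicted mask.

```python
# Minimal Visdom example: push an image, a mask, and the referring expression.
import torch
import visdom

vis = visdom.Visdom()                               # connects to http://localhost:8097 by default

image = torch.rand(3, 256, 256)                     # C x H x W placeholder for the input image
mask = (torch.rand(1, 256, 256) > 0.5).float()      # placeholder for a binary segmentation mask
phrase = 'sofa against the wall'

vis.image(image, opts=dict(title='input image', caption=phrase))
vis.image(mask, opts=dict(title='predicted mask', caption=phrase))
vis.text(phrase, opts=dict(title='referring expression'))
```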

andfoy commented 5 years ago

So I need to choose sentences using the vocabulary already present in the dataset?

Preferably; however, the RefCOCO vocabulary is a large, comprehensive corpus, at least for commonly used words. If a word is not found, it will be replaced by the special token <UNK>.
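
A small illustration of that <UNK> fallback, with a toy dictionary rather than the actual RefCOCO Corpus:

```python
# Toy illustration of the <UNK> fallback; the real dictionary lives in the
# loader's Corpus object.
vocab = {'<unk>': 1, 'sofa': 2, 'against': 3, 'the': 4, 'wall': 5}

phrase = 'ottoman against the wall'                 # 'ottoman' is not in the toy vocabulary
tokens = [vocab.get(w, vocab['<unk>']) for w in phrase.lower().split()]
print(tokens)                                       # [1, 3, 4, 5] -> 'ottoman' mapped to <unk>
```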

Shivanshmundra commented 5 years ago

Thanks a lot, @andfoy. I will try things out keeping this in mind.

Shivanshmundra commented 5 years ago

@andfoy Thanks. I was able to visualize through the Visdom interface, although there are two issues: in the output mask there is only a grey picture, and the caption is not visualized in the image.

andfoy commented 5 years ago

In the output mask there is only a grey picture.

Are you running the visualization routine at high resolution? This can be enabled by passing the --high-res flag to the script.
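
(Unrelated to what --high-res does internally: if you ever need to inspect a coarse, low-resolution mask at full image size, a generic PyTorch upsampling step such as the following works for display purposes; the sizes below are arbitrary.)

```python
# Generic display-time upsampling of a coarse mask to the original image size.
import torch
import torch.nn.functional as F

low_res_mask = torch.rand(1, 1, 32, 32)             # N x C x h x w, e.g. a coarse sigmoid output
full_size = (512, 512)                              # (H, W) of the original image

upsampled = F.interpolate(low_res_mask, size=full_size, mode='bilinear', align_corners=False)
binary = (upsampled > 0.5).float()                  # threshold for visualization
print(binary.shape)                                 # torch.Size([1, 1, 512, 512])
```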

andfoy commented 5 years ago

The caption is not visualized in the image. I tried changing caption to title here, but it resulted in a very small text display with a lot of .

That means there are discrepancies between the loaded vocabulary and your phrases. Could you please check the Corpus object inside the loader?

Shivanshmundra commented 5 years ago

After I ran with the --high-res flag, this is the output: 0_ Query__ sofa _unk_ against _unk_ wall. Is it supposed to look like this?

Shivanshmundra commented 5 years ago

The caption is not visualized in the image. I tried changing caption to title here, but it resulted in a very small text display with a lot of .

That means there are discrepancies between the loaded vocabulary and your phrases. Could you please check the Corpus object inside the loader?

I used the ReferIt dataset only and tried different splits. Is it still supposed to assign <unk> to some words?

Shivanshmundra commented 5 years ago

Hi @andfoy, ping!

andfoy commented 5 years ago

Hi @andfoy, ping!

Sorry @Shivanshmundra, but we, the maintainers, have other tasks to do. Just because I was able to answer this issue quickly does not mean that we can answer as quickly as you would like all the time. Please be more considerate of our time and work and refrain from making these kinds of rushed requests.

andfoy commented 5 years ago

I used the ReferIt dataset only and tried different splits. Is it still supposed to assign <unk> to some words?

Which pretrained weights did you use?

Shivanshmundra commented 5 years ago

I am really sorry, @andfoy. I know you may have a heavy workload; you replied so quickly to the previous question that I assumed the latest discussion had slipped past your inbox. You can reply at your convenience. I made these rushed requests because some work with a close deadline suddenly came to me, but I will make sure not to disturb you anymore.

Shivanshmundra commented 5 years ago

I used the ReferIt dataset only and tried different splits. Is it still supposed to assign <unk> to some words?

Which pretrained weights did you use?

I used the UNC high-resolution pretrained weights. I can extract the phrases by tweaking the dataloader; I was more concerned about the results (segmentation masks) I got from the pretrained model. I think this is probably related to #31 as well. I am afraid I am doing something wrong, but I followed the steps sequentially, so I am not able to figure out what. Again, sorry for the inconvenience; you can reply when you get time.

andfoy commented 5 years ago

I used the UNC high-resolution pretrained weights. I can extract the phrases by tweaking the dataloader; I was more concerned about the results (segmentation masks) I got from the pretrained model. I think this is probably related to #31 as well. I am afraid I am doing something wrong, but I followed the steps sequentially, so I am not able to figure out what.

Would you please share the command you are using to visualize the masks?

andfoy commented 5 years ago

I made these rushed requests because some work with a close deadline suddenly came to me, but I will make sure not to disturb you anymore.

Don't worry; this repository is public and open to issues and pull requests precisely so that contributors can ask questions or suggest improvements.

Shivanshmundra commented 5 years ago

I used the UNC high-resolution pretrained weights. I can extract the phrases by tweaking the dataloader; I was more concerned about the results (segmentation masks) I got from the pretrained model. I think this is probably related to #31 as well. I am afraid I am doing something wrong, but I followed the steps sequentially, so I am not able to figure out what.

Would you please share the command you are using to visualize the masks?

python -W ignore -m dmn_pytorch.visdom_display --data referit_data/ --split testB --dataset unc --snapshot weights/dmn_unc_weights.pth

andfoy commented 5 years ago

What happens if you run python -m dmn_pytorch.visdom_display --data referit_data --dataset unc --split testB --backend dpn92 --num-filters 10 --lang-layers 3 --mix-we --snapshot weights/dmn_unc_weights.pth --high-res

andfoy commented 5 years ago

By the way, judging by the -W ignore, it seems that you are using the latest PyTorch release. What happens if you downgrade, at least to 1.0.1? Also, how did you install the SRU?

Shivanshmundra commented 5 years ago

What happens if you run python -m dmn_pytorch.visdom_display --data referit_data --dataset unc --split testB --backend dpn92 --num-filters 10 --lang-layers 3 --mix-we --snapshot weights/dmn_unc_weights.pth --high-res

Whoa! Now it is working properly, as shown in the paper. I don't know what I was doing wrong. I will figure out the <unk> thing on my own. By the way, here is an example: [image]

Thanks a lot again, @andfoy!

Shivanshmundra commented 5 years ago

By the way, judging by the -W ignore, it seems that you are using the latest PyTorch release. What happens if you downgrade, at least to 1.0.1? Also, how did you install the SRU?

Yes, I was using the latest PyTorch release. I installed SRU using the command given in the README: pip install -U git+https://github.com/taolei87/sru.git@43c85ed --no-deps. After the previous message, I guess I don't need to downgrade the PyTorch version, since it is working.

andfoy commented 5 years ago

@Shivanshmundra I'm glad to hear that you were able to visualize DMN outputs correctly. I guess I can close this one. If you have more questions, feel free to open a new issue.