Hi @IsaacKam--thanks for raising this issue!
Good catch. It looks like I accidentally set the wrong default output size for the segmentation decoders: it should be 64 channels, not 128. I just pushed a fix, which you can pick up on your end by running the following:
pip uninstall visualpriors
pip install https://github.com/alexsax/midlevel-reps/archive/visualpriors-v0.3.1.zip
Aside from the above shape issues, I also want to note that I imagine the decodings are primarily useful for debugging. Visualizing those outputs will give you confidence that everything is working correctly.
For learning, though, I've found the encodings to be generally more useful than the decodings. This is because the encodings all have a homogeneous shape (8 x 16 x 16), while the decodings can take various forms: for example, segment_unsup2d produces a 64-channel image, while class_object is a 1000-dimensional vector. And using the encodings doesn't really sacrifice anything: I've anecdotally found that downstream performance with the encodings is usually at least as good as, if not better than, with the decodings.
I'm closing this issue for now, but if the above doesn't solve your problem then please feel free to reopen.
This is really useful :), thank you for the prompt reply. For learning from the encodings, what would you recommend as the best way to utilise them? I.e. would you flatten them at this point and apply linear layers, or is there a benefit to applying some conv layers here?
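For concreteness, the two options I have in mind would look roughly like this (just a sketch; num_actions is a placeholder for my output size):

import torch.nn as nn

num_actions = 4  # placeholder for whatever my downstream output size is

# Option A: flatten the 8 x 16 x 16 encoding and go straight to linear layers
flat_head = nn.Sequential(
    nn.Flatten(),                    # [B, 8, 16, 16] -> [B, 2048]
    nn.Linear(8 * 16 * 16, 256),
    nn.ReLU(),
    nn.Linear(256, num_actions),
)

# Option B: a couple of conv layers on the spatial encoding first, then flatten
conv_head = nn.Sequential(
    nn.Conv2d(8, 32, kernel_size=3, stride=2, padding=1),   # [B, 32, 8, 8]
    nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),  # [B, 64, 4, 4]
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(64 * 4 * 4, num_actions),
)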
Hi Alex, it now seems to output a torch.Size([1, 64, 256, 256]) tensor when I use 'segment_unsup2d'. Is that correct? If so, what do the channels represent (different segments?)
Thanks for this great piece of work! When I change the template code from 'normal' to 'segment_unsup2d' like this:
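(Roughly this, adapted from the README template; my actual image path and preprocessing differ slightly:)

from PIL import Image
import torchvision.transforms.functional as TF
import visualpriors

# Load an image and rescale/resize to [-1, 1] and 3 x 256 x 256, as in the template
image = Image.open('test.png').convert('RGB')
x = TF.to_tensor(TF.resize(image, 256)) * 2 - 1
x = x.unsqueeze(0)

# Changed the task from 'normal' to 'segment_unsup2d' here
pred = visualpriors.feature_readout(x, 'segment_unsup2d', device='cpu')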
I get the following error, which is being caused by the feature_readout function:
Let me know if I'm doing something wrong.