[BUG] Compute Uniqueness not working on Windows

sourabhyadav commented 3 years ago

Instructions

I tried to find duplicate images in a local dataset, but I ran into the following issue.

System information

OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10
FiftyOne installed from (pip or source): pip3 install fiftyone
FiftyOne version (run fiftyone --version): default
Python version: python 3.9

Commands to reproduce

import fiftyone as fo
dataset = fo.Dataset.from_images_dir("D:\\data\\similar\\class1", recursive=True, name="sim-images")  
session = fo.launch_app(dataset)
import fiftyone.brain as fob
fob.compute_uniqueness(dataset)

Describe the problem

It looks like there is some problem with model loading.

Loading uniqueness model...
Downloading model from Google Drive ID '1SIO9XreK0w1ja4EuhBWcR10CnWxCOsom'...
 100% |██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████|  100.6Mb/100.6Mb [1.4s elapsed, 0s remaining, 85.2Mb/s]
Uncaught exception
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<frozen fiftyone.brain>", line 146, in compute_uniqueness
  File "<frozen fiftyone.brain.internal.core.uniqueness>", line 67, in compute_uniqueness
  File "<frozen fiftyone.brain.internal.core.uniqueness>", line 88, in _load_model
  File "<frozen fiftyone.brain.internal.models>", line 183, in load_model
  File "C:\Users\soura\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\fiftyone\core\models.py", line 472, in load_model
    return config.build()
  File "C:\Users\soura\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\eta\core\learning.py", line 296, in build
    return self._model_cls(self.config)
  File "C:\Users\soura\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\fiftyone\utils\torch.py", line 535, in __init__
    self._model = self._load_model(config)
  File "C:\Users\soura\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\fiftyone\utils\torch.py", line 730, in _load_model
    self._load_state_dict(model, config)
  File "<frozen fiftyone.brain.internal.models.torch>", line 44, in _load_state_dict
  File "C:\Users\soura\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\torch\nn\modules\module.py", line 1051, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Network:
        Missing key(s) in state_dict: "prep\conv.weight", "prep\bn.weight", "prep\bn.bias", "prep\bn.running_mean", "prep\bn.running_var", "layer1\conv.weight", "layer1\bn.weight", "layer1\bn.bias", "layer1\bn.running_mean", "layer1\bn.running_var", "layer1\residual\res1\conv.weight", "layer1\residual\res1\bn.weight", "layer1\residual\res1\bn.bias", "layer1\residual\res1\bn.running_mean", "layer1\residual\res1\bn.running_var", "layer1\residual\res2\conv.weight", "layer1\residual\res2\bn.weight", "layer1\residual\res2\bn.bias", "layer1\residual\res2\bn.running_mean", "layer1\residual\res2\bn.running_var", "layer2\conv.weight", "layer2\bn.weight", "layer2\bn.bias", "layer2\bn.running_mean", "layer2\bn.running_var", "layer3\conv.weight", "layer3\bn.weight", "layer3\bn.bias", "layer3\bn.running_mean", "layer3\bn.running_var", "layer3\residual\res1\conv.weight", "layer3\residual\res1\bn.weight", "layer3\residual\res1\bn.bias", "layer3\residual\res1\bn.running_mean", "layer3\residual\res1\bn.running_var", "layer3\residual\res2\conv.weight", "layer3\residual\res2\bn.weight", "layer3\residual\res2\bn.bias", "layer3\residual\res2\bn.running_mean", "layer3\residual\res2\bn.running_var".
        Unexpected key(s) in state_dict: "prep_conv.weight", "prep_bn.weight", "prep_bn.bias", "prep_bn.running_mean", "prep_bn.running_var", "prep_bn.num_batches_tracked", "layer1_conv.weight", "layer1_bn.weight", "layer1_bn.bias", "layer1_bn.running_mean", "layer1_bn.running_var", "layer1_bn.num_batches_tracked", "layer1_residual_res1_conv.weight", "layer1_residual_res1_bn.weight", "layer1_residual_res1_bn.bias", "layer1_residual_res1_bn.running_mean", "layer1_residual_res1_bn.running_var", "layer1_residual_res1_bn.num_batches_tracked", "layer1_residual_res2_conv.weight", "layer1_residual_res2_bn.weight", "layer1_residual_res2_bn.bias", "layer1_residual_res2_bn.running_mean", "layer1_residual_res2_bn.running_var", "layer1_residual_res2_bn.num_batches_tracked", "layer2_conv.weight", "layer2_bn.weight", "layer2_bn.bias", "layer2_bn.running_mean", "layer2_bn.running_var", "layer2_bn.num_batches_tracked", "layer3_conv.weight", "layer3_bn.weight", "layer3_bn.bias", "layer3_bn.running_mean", "layer3_bn.running_var", "layer3_bn.num_batches_tracked", "layer3_residual_res1_conv.weight", "layer3_residual_res1_bn.weight", "layer3_residual_res1_bn.bias", "layer3_residual_res1_bn.running_mean", "layer3_residual_res1_bn.running_var", "layer3_residual_res1_bn.num_batches_tracked", "layer3_residual_res2_conv.weight", "layer3_residual_res2_bn.weight", "layer3_residual_res2_bn.bias", "layer3_residual_res2_bn.running_mean", "layer3_residual_res2_bn.running_var", "layer3_residual_res2_bn.num_batches_tracked".

What areas of FiftyOne does this bug affect?

[ ] App: FiftyOne application issue
[x] Core: Core fiftyone Python library issue
[ ] Server: Fiftyone server issue

Willingness to contribute

The FiftyOne Community encourages bug fix contributions. Would you or another member of your organization be willing to contribute a fix for this bug to the FiftyOne codebase?

[ ] Yes. I can contribute a fix for this bug independently.
[ ] Yes. I would be willing to contribute a fix for this bug with guidance from the FiftyOne community.
[x] No. I cannot contribute a bug fix at this time.

benjaminpkane commented 3 years ago

I don't have any reason to believe that this is a FiftyOne issue just yet, though I am still investigating.

Could you provide your PyTorch version? It may be that upgrading PyTorch will solve the issue, although it is only a suggestion.

sourabhyadav commented 3 years ago

Yeah, it seems there is a problem with model loading. FYI, this is on Windows, so there is no CUDA support.

torch: 1.7.1
torchvision: 0.2.2.post3

brimoor commented 3 years ago

Hmm, 0.2.2.post3 is a fairly old version of torchvision. For example, I'm running torchvision==0.8.2 with torch==1.7.1. However, I downgraded to 0.2.2.post3 on macOS (also CPU only) and was able to run compute_uniqueness() with no problem.

Looking at the error message, this seems to be a Windows path problem. The stack trace is complaining that variable\name.weight is expected but variable_name.weight is found instead.
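
As a rough illustration of the mismatch in the error message, here is a minimal sketch of a workaround that renames the checkpoint's keys to the names the model expects before loading. It uses plain dicts so it runs without PyTorch. Note that remap_checkpoint_keys is a hypothetical helper, not part of FiftyOne or PyTorch, and it assumes the separator is the only difference between the two sets of names.

```python
def remap_checkpoint_keys(checkpoint, expected_keys, model_sep="\\", ckpt_sep="_"):
    """Rename checkpoint keys to the names the model expects.

    Hypothetical helper: assumes the two naming schemes differ only in
    the separator (backslash in the model, underscore in the checkpoint).
    """
    # Map "what the expected key would look like in the checkpoint" -> expected key
    lookup = {key.replace(model_sep, ckpt_sep): key for key in expected_keys}

    # Keys with no counterpart (e.g. num_batches_tracked) are kept unchanged
    return {lookup.get(name, name): value for name, value in checkpoint.items()}


# Toy stand-ins for a few of the keys listed in the traceback
expected = [
    "prep\\conv.weight",
    "prep\\bn.weight",
    "layer1\\residual\\res1\\conv.weight",
]
checkpoint = {
    "prep_conv.weight": 1,
    "prep_bn.weight": 2,
    "prep_bn.num_batches_tracked": 3,
    "layer1_residual_res1_conv.weight": 4,
}

fixed = remap_checkpoint_keys(checkpoint, expected)
missing = sorted(set(expected) - set(fixed))
print(missing)
```

With a real model, the remapped dict would be what gets passed to load_state_dict(). The leftover num_batches_tracked entries would still be reported as unexpected unless strict=False is used. Whether a rename like this is appropriate depends on why the names differ in the first place, which has not been established in this thread.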

Are you able to try upgrading your torchvision version? Perhaps this is something that the torchvision team has fixed in later versions. I briefly checked online and didn't find anything about this...

sourabhyadav commented 3 years ago

I updated torch and torchvision to:

torch                         1.7.1+cpu
torchaudio                    0.7.2
torchvision                   0.8.2+cpu

But the issue remains the same. Has anyone tried this on a Windows PC?

benjaminpkane commented 3 years ago

Hopefully today. I need to spin up a Windows machine first. Apologies.

ShaneGilroy commented 3 years ago

Any update on this issue?

brimoor commented 3 years ago

Hi @sourabhyadav @ShaneGilroy

No update on getting the default model used by compute_uniqueness() to work on Windows yet.

However, let me tell you a secret: you can compute your own embeddings and pass them to the method instead, which will work in any environment. For example, you can use any model that exposes embeddings from the FiftyOne Model Zoo.

Here's a handy FiftyOne CLI command to see what models are available:

fiftyone zoo models list --tags embeddings

And here's an example workflow:

import fiftyone as fo
import fiftyone.brain as fob
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

# Remove existing uniqueness field
dataset = dataset.exclude_fields("uniqueness").clone()

session = fo.launch_app(dataset)

# Compute your own embeddings
model = foz.load_zoo_model("mobilenet-v2-imagenet-torch")
embeddings = dataset.compute_embeddings(model)

# Index by uniqueness using pre-computed embeddings
fob.compute_uniqueness(dataset, embeddings=embeddings)
print(dataset)

# Show least unique images in the App
session.view = dataset.sort_by("uniqueness")

However, let me also recommend that you take a look at visual similarity instead of uniqueness, which many users find more useful in practice. It's a similar idea, but more flexible.

For example, continuing from above:

# Index by visual similarity
fob.compute_similarity(dataset, embeddings=embeddings, brain_key="img_sim")

Then you can use the App to sort by visual similarity to samples of interest, or you can follow this workflow to find near-duplicate images.
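
To give a feel for what near-duplicate detection on embeddings amounts to, here is a minimal pure-Python sketch that flags pairs of samples whose embeddings have a small cosine distance. This is only an illustration and not FiftyOne's implementation: near_duplicate_pairs and the 0.05 threshold are assumptions made up for this example, and the toy vectors stand in for real embeddings.

```python
import math
from itertools import combinations


def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def near_duplicate_pairs(embeddings, thresh=0.05):
    """Return index pairs (i, j) with i < j whose distance is below thresh.

    Hypothetical helper; the threshold is an arbitrary choice for this sketch.
    """
    pairs = []
    for i, j in combinations(range(len(embeddings)), 2):
        if cosine_distance(embeddings[i], embeddings[j]) < thresh:
            pairs.append((i, j))
    return pairs


# Toy embeddings: the first two point in almost the same direction
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.99, 0.01, 0.0],
    [0.0, 1.0, 0.0],
]

pairs = near_duplicate_pairs(toy_embeddings)
print(pairs)
```

The all-pairs loop is quadratic in the number of samples, so it is only meant to show the idea on small inputs. A suitable threshold depends on the embedding model and the data.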

TommeTao commented 2 years ago

However, let me tell you a secret: you can compute your own embeddings and pass them to the method instead, which will work in any environment.

@brimoor Thanks for the hint. I am wondering: what is the difference between compute_uniqueness() and compute_similarity(), then? Your hint seems to suggest that the underlying approach is the same for both methods. Is that correct? And do the two methods then just update the dataset differently, with compute_uniqueness() adding a uniqueness score as a field while compute_similarity() returns results with methods you can call?

brimoor commented 2 years ago

The two methods both use deep embeddings, but they do slightly different things with them.
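
As a toy illustration of that distinction (not the actual algorithms used by the Brain, which are not described in this thread), the sketch below derives two different things from the same embeddings: one scalar score per sample, and a ranking of samples relative to a chosen query. The function names and the nearest-neighbor scoring rule are assumptions made up for this example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def toy_uniqueness(embeddings):
    """One score per sample: distance to its nearest other sample, scaled to [0, 1].

    Hypothetical scoring rule; requires at least two samples.
    """
    nearest = [
        min(euclidean(e, other) for j, other in enumerate(embeddings) if j != i)
        for i, e in enumerate(embeddings)
    ]
    largest = max(nearest) or 1.0  # avoid dividing by zero if all samples coincide
    return [d / largest for d in nearest]


def toy_similarity_ranking(embeddings, query_index):
    """Indices of the other samples, ordered from most to least similar to the query."""
    query = embeddings[query_index]
    others = [i for i in range(len(embeddings)) if i != query_index]
    return sorted(others, key=lambda i: euclidean(query, embeddings[i]))


# Two nearly identical samples and one outlier
toy = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]

scores = toy_uniqueness(toy)                           # a value stored per sample
ranking = toy_similarity_ranking(toy, query_index=0)   # an answer to a query
print(scores, ranking)
```

In this sketch the score is something you could store as a field and sort by, while the ranking only exists relative to a query sample, which loosely mirrors the difference between a uniqueness field and a similarity index.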

But the main point here is that the two methods also use different default models to generate embeddings. The recommendation was to try a different model like "mobilenet-v2-imagenet-torch" from our Model Zoo if you are a Windows user trying to use compute_uniqueness(), since that model will work while the default one does not currently seem to work.

Sejal1506 commented 1 year ago

Any update on this issue?

benjaminpkane commented 1 year ago

Hi @Sejal1506. We don't have an update on getting the default model used by compute_uniqueness() to work on Windows yet. It seems that some part of PyTorch is rewriting the state dict. Note that fiftyone-brain is no longer a frozen package, which means full stack traces are available. Sharing any findings could expedite a solution.