RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 3 and 1 in dimension 1 at /pytorch/aten/src/TH/generic/THTensor.cpp:612

joausaga commented 4 years ago

I have the following error when trying to predict the demographics of a list of twitter users.

Predicting...:   0%|          | 36/54307 [04:36<107:30:38,  7.13s/it]
  File ".../src/utils/demographic_detector.py", line 43, in infer
    predictions = self.m3twitter.infer(user_objs)
  File ".../.conda/envs/twcovid/lib/python3.7/site-packages/m3inference/m3inference.py", line 125, in infer
    for batch in tqdm(dataloader, desc='Predicting...'):
  File ".../.conda/envs/twcovid/lib/python3.7/site-packages/tqdm/std.py", line 1108, in __iter__
    for obj in iterable:
  File ".../.conda/envs/twcovid/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File ".../.conda/envs/twcovid/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File ".../.conda/envs/twcovid/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File ".../.conda/envs/twcovid/lib/python3.7/site-packages/torch/_utils.py", line 394, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File ".../.conda/envs/twcovid/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File ".../.conda/envs/twcovid/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File ".../.conda/envs/twcovid/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 79, in default_collate
    return [default_collate(samples) for samples in transposed]
  File ".../.conda/envs/twcovid/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 79, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File ".../.conda/envs/twcovid/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 55, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 3 and 1 in dimension 1 at /pytorch/aten/src/TH/generic/THTensor.cpp:612

The list of users can be found here

zijwang commented 4 years ago

Did you install m3inference from pip? If so, could you see whether installing from the master branch helps?

joausaga commented 4 years ago

No, the problem persists. Are the JSON lines in the file read in order from top to bottom? If so, it always breaks at line 36, which is this user: https://twitter.com/JUc3m. Could black-and-white profile pictures be problematic?

zijwang commented 4 years ago

Yes, it should be in order. Have you checked whether you have that image on your disk, and if so, whether you are able to open it?

joausaga commented 4 years ago

Yes, I have it on my disk. I could open it, convert it to RGB, resize it, and transform it into a tensor.

computermacgyver commented 4 years ago

Thanks, Jorge, for helping us get to the bottom of this. I put that one user in a separate jsonl file (just one line) and downloaded the profile from Twitter. It ran for me without any issue.

Do you think you could try the same? I.e., put just that user in a file and run the infer method on that file?

The infer method works in batches, so my suspicion is that the problem is not this line/user in particular but one near it. I haven't downloaded the profile photos for the other users to test that yet, but if the one-line/one-user file runs for you, then perhaps you could try adding additional users until it breaks?
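Because the progress bar counts batches rather than lines, it can help to translate the count into a range of JSONL lines. The helper below is only a sketch: `batch_line_range` is a hypothetical name, and it assumes one record per line, no shuffling, and that the number shown by the progress bar is the count of batches that completed before the error. The batch size must be whatever value was actually passed to infer; the 16 used in the example is purely illustrative.

```python
def batch_line_range(completed_batches, batch_size):
    """Return the 1-based (first, last) JSONL lines of the next batch.

    `completed_batches` is the count shown by the progress bar when the
    error was raised, i.e. how many batches finished without failing.
    """
    if completed_batches < 0 or batch_size < 1:
        raise ValueError("completed_batches must be >= 0 and batch_size >= 1")
    first = completed_batches * batch_size + 1
    last = first + batch_size - 1
    return first, last


# Illustrative values only: 36 completed batches, batch size of 16.
print(batch_line_range(36, 16))
```

Any record in the returned range could be the one that breaks collation, which is why the user reported by line number alone may be a false lead.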

Would you also be able to confirm the version of Python you're using and the OS (Windows, Mac, Linux)? We've tested extensively on Linux and Mac, but I have seen a few issues popping up on Windows.

computermacgyver commented 4 years ago

You can also run with the parameters batch_size=1, num_workers=1 to help better isolate the failing user.

m3.infer('tmp.jsonl',batch_size=1,num_workers=1)
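One way to automate this isolation is to feed each record to the model on its own and collect the lines that raise. This is a rough sketch, not part of m3inference: `find_failing_lines` is a hypothetical helper, and `run_on_file` stands for any callable that takes a JSONL path, for example a lambda wrapping the `m3.infer(..., batch_size=1, num_workers=1)` call shown above.

```python
import os
import tempfile


def find_failing_lines(jsonl_path, run_on_file):
    """Run `run_on_file` on each record separately.

    Returns a list of (line_number, error_repr) for records that raised.
    """
    failures = []
    with open(jsonl_path, encoding="utf-8") as src:
        for line_no, line in enumerate(src, start=1):
            if not line.strip():
                continue  # skip blank lines
            # Write the single record to its own temporary JSONL file.
            fd, tmp_path = tempfile.mkstemp(suffix=".jsonl")
            try:
                with os.fdopen(fd, "w", encoding="utf-8") as tmp:
                    tmp.write(line if line.endswith("\n") else line + "\n")
                run_on_file(tmp_path)
            except Exception as exc:  # record the failure and keep going
                failures.append((line_no, repr(exc)))
            finally:
                os.remove(tmp_path)
    return failures


# Hypothetical usage, assuming `m3` is an already constructed model object:
# bad = find_failing_lines(
#     "users.jsonl",
#     lambda path: m3.infer(path, batch_size=1, num_workers=1),
# )
```

Running one record at a time is slow for a large file, so it is probably best applied only to the lines of the batch that failed.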

joausaga commented 4 years ago

Great, thanks Scott (@computermacgyver)! I found the problem. It seems we might need to extend m3inference to support GIF images. This user has a GIF as her profile picture. The predictor breaks because, when the GIF is transformed into a tensor, the resulting tensor is of size 3x224x224 and not 1x224x224 as expected. I guess the first dimension is 3 because the GIF is composed of three images, which are used to perform the animation.

I am running Python 3.7.1 on a Linux/Ubuntu machine.
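To find other users whose profile picture is a GIF before running the predictor, one option is to check the first bytes of each image file, since every GIF starts with `GIF87a` or `GIF89a` regardless of its extension. The sketch below uses only the standard library; `is_gif` and `gif_records` are hypothetical helpers, and the `img_path` key is an assumption about the input records.

```python
import json

# Every GIF file begins with one of these two 6-byte signatures.
GIF_SIGNATURES = (b"GIF87a", b"GIF89a")


def is_gif(path):
    """True if the file starts with a GIF signature, whatever its extension."""
    with open(path, "rb") as fh:
        return fh.read(6) in GIF_SIGNATURES


def gif_records(jsonl_path, image_key="img_path"):
    """Return (line_number, image_path) for every record whose image is a GIF."""
    hits = []
    with open(jsonl_path, encoding="utf-8") as src:
        for line_no, line in enumerate(src, start=1):
            if not line.strip():
                continue  # skip blank lines
            image_path = json.loads(line).get(image_key)
            if image_path and is_gif(image_path):
                hits.append((line_no, image_path))
    return hits
```

A missing image file raises `FileNotFoundError` here, which is arguably useful in a pre-flight check since such a record would fail later anyway.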

zijwang commented 4 years ago

Hey @joausaga! We have updated the package and I think the issue with GIFs has been resolved. Could you try out the new version (v1.1.0) and see whether it works?

Here is what I did based on your example:

> python scripts/m3twitter.py --screen-name hermanas_malas --auth ./scripts/auth_example.txt --skip-cache

08/13/2020 10:50:52 - INFO - m3inference.m3inference -   Version 1.1.0
08/13/2020 10:50:52 - INFO - m3inference.m3inference -   Running on cpu.
08/13/2020 10:50:52 - INFO - m3inference.m3inference -   Will use full M3 model.
08/13/2020 10:50:53 - INFO - m3inference.m3inference -   Model full_model exists at [masked_link]/full_model.mdl.
08/13/2020 10:50:53 - INFO - m3inference.utils -   Checking MD5 for model full_model at [masked_link]/full_model.mdl
08/13/2020 10:50:54 - INFO - m3inference.utils -   MD5s match.
08/13/2020 10:50:54 - INFO - m3inference.m3inference -   Loaded pretrained weight at [masked_link]/full_model.mdl
08/13/2020 10:50:54 - INFO - m3inference.m3twitter -   skip_cache is True. Fetching data from Twitter for hermanas_malas.
08/13/2020 10:50:54 - INFO - m3inference.m3twitter -   GET /users/show.json?screen_name=hermanas_malas
08/13/2020 10:50:54 - INFO - m3inference.dataset -   1 data entries loaded.
Predicting...: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00,  1.20s/it]
{'input': {'description': 'No soy ni un troll ni un robot, me tienen harta!!! '
                          'Digo lo que pienso, nada más que eso. Si no les '
                          'gusta, no me lean! Podrida de la Korrupción. Quiero '
                          'Justicia!',
           'id': '351160731',
           'img_path': '[masked_link]/hermanas_malas_224x224.png',
           'lang': 'es',
           'name': 'Pía Ferrer 🐱',
           'screen_name': 'hermanas_malas'},
 'output': {'age': {'19-29': 0.4107,
                    '30-39': 0.1334,
                    '<=18': 0.3018,
                    '>=40': 0.154},
            'gender': {'female': 0.8792, 'male': 0.1208},
            'org': {'is-org': 0.8338, 'non-org': 0.1662}}}

joausaga commented 4 years ago

Hi @zijwang, great improvement to the tool! I'll try it out and let you know if there is any trouble.

computermacgyver commented 4 years ago

Thanks, @joausaga. We appreciate your help in discovering and diagnosing the issue. To be clear, we have updated the preprocessing code, so animated GIFs downloaded with the M3Twitter wrapper or preprocessed with scripts/preprocess.py will automatically be converted to non-animated PNG/JPEG formats.

If you pass an animated GIF directly to the .infer(...) method, the code will still fail. We're open to possibly checking and reformatting images there, but in general we expect images to have been preprocessed already (e.g., we do not check the dimensions of images in the .infer(...) method, but expect them to already be properly sized).

I don't think this applies to your use case, but if you use the M3Twitter wrapper to fetch user profile information, it now requires an API key. Details are in README.md.

joausaga commented 4 years ago

Oh, good to know. I directly use transform_jsonl_object from m3twitter.py, which uses download_resize_img. I don't see any change in the definition of these functions, so I assume I am safe.

Where is the processing of gifs happening?

computermacgyver commented 4 years ago

Yes, no API keys are needed for those functions.

When you call transform_jsonl_object, any of the paths that involve the image being resized call get_extension. If the file extension is ".gif", the function returns ".png", and this new filename is used as the output file and format for any resized/downloaded image. Just looking at this method specifically, I see that this doesn't happen if resize_img=False, which is something we might want to consider.

I will take a closer look specifically at that preprocessing method (which is one we definitely intend to support) to make sure the conversion is happening.
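The extension mapping described above can be illustrated with a small stand-in. This is not the library's get_extension; `output_extension` and `resized_path` are hypothetical names, the lower-casing is an added assumption, and the `_224x224` suffix simply mirrors the img_path seen in the sample output earlier in the thread.

```python
import os


def output_extension(img_path):
    """Hypothetical stand-in: map ".gif" to ".png", keep other extensions."""
    ext = os.path.splitext(img_path)[1].lower()  # lower-casing is an assumption
    return ".png" if ext == ".gif" else ext


def resized_path(img_path, size=224):
    """Build the output filename for a resized copy of `img_path`."""
    stem = os.path.splitext(img_path)[0]
    return f"{stem}_{size}x{size}{output_extension(img_path)}"


print(resized_path("cache/hermanas_malas.gif"))
```

A check based only on the extension would miss a GIF saved under another extension, and, as noted in the comment above, the mapping is skipped entirely when the image is not resized.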

euagendas / m3inference

RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 3 and 1 in dimension 1 at /pytorch/aten/src/TH/generic/THTensor.cpp:612 #5