euagendas / m3inference

A deep learning system for demographic inference (gender, age, and individual/person) trained on a massive Twitter dataset using profile images, screen names, names, and biographies
http://www.euagendas.org
GNU Affero General Public License v3.0

Predicting...0% #25

Closed by isaac1902 2 years ago

isaac1902 commented 2 years ago

Hi,

I tried using the library with text_mode and it works fine. When I use the full_mode, prediction doesn't work, but I don't get any error. It just gets stuck.

This is basically my code:

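import json
from m3inference import M3Inference, preprocess

# pic_url and data_set are defined elsewhere (not shown)
m3 = M3Inference(use_full_model=True)
preprocess.download_resize_img(pic_url, "profile_pic.jpg", "profile_pic_fs.jpg")

# Write one JSON object per line (JSON Lines)
with open('data.jsonl', 'w') as outfile:
    for entry in data_set:
        json.dump(entry, outfile)
        outfile.write('\n')

pred = m3.infer('data.jsonl')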

This is the output:

10/04/2021 15:47:05 - INFO - m3inference.m3inference -   Version 1.1.5
10/04/2021 15:47:05 - INFO - m3inference.m3inference -   Running on cpu.
10/04/2021 15:47:05 - INFO - m3inference.m3inference -   Will use full M3 model.
10/04/2021 15:47:06 - INFO - m3inference.m3inference -   Model full_model exists at /Users/vv/m3/models/full_model.mdl.
10/04/2021 15:47:06 - INFO - m3inference.utils -   Checking MD5 for model full_model at /Users/vv/m3/models/full_model.mdl
10/04/2021 15:47:06 - INFO - m3inference.utils -   MD5s match.
10/04/2021 15:47:06 - INFO - m3inference.m3inference -   Loaded pretrained weight at /Users/vv/m3/models/full_model.mdl
10/04/2021 15:47:06 - INFO - m3inference.dataset -   1 data entries loaded.
Predicting...:   0%|          | 0/1 [00:00<?, ?it/s]

Any idea about the issue?

Thanks.

computermacgyver commented 2 years ago

Could you try with just a single account that you don't mind sharing, and share data.jsonl? The account could belong to a celebrity, for example, to avoid including any personal information.

computermacgyver commented 2 years ago

It might also help if you run pip freeze > environment.txt to share the versions of all the libraries you have installed, run python --version to share the Python version, and let us know which OS you're using. We've tested most extensively on Linux and a bit on Mac. Unfortunately we don't have any Windows machines, but we can try to diagnose any Windows issues together.
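The same details can also be gathered from inside Python, which guarantees they describe the interpreter that actually runs the model. This is only a sketch: the function name collect_environment is hypothetical, and importlib.metadata is part of the standard library from Python 3.8 onward, so on older versions the pip freeze command above is the simpler route.

```python
import json
import platform
import sys
from importlib import metadata  # standard library from Python 3.8 onward


def collect_environment():
    """Return the Python version, OS description, and installed packages."""
    packages = sorted(
        f"{dist.metadata['Name']}=={dist.version}"
        for dist in metadata.distributions()
        if dist.metadata["Name"]  # skip distributions with broken metadata
    )
    return {
        "python": platform.python_version(),
        "executable": sys.executable,
        "os": platform.platform(),
        "packages": packages,
    }


if __name__ == "__main__":
    # Print a report that can be pasted into an issue comment.
    print(json.dumps(collect_environment(), indent=2))
```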

isaac1902 commented 2 years ago

Hi @computermacgyver

I'm on macOS, and the Python version is 3.7.6.

data.jsonl: {"id": "2829", "name": "Nadal", "screen_name": "rafanadal", "description": "Sono andato a Roma ieri", "lang": "it", "img_path": "profile_pic.jpg"}

I've attached the environment.txt file.

Thanks.
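Since the full model reads a profile image for every entry, one thing that can be ruled out locally is a malformed line or a missing image file. Below is a minimal sketch of such a check. The helper check_jsonl is hypothetical and not part of m3inference, and the list of expected keys is simply taken from the data.jsonl entry above rather than from the library's documentation.

```python
import json
import os

# Keys taken from the data.jsonl entry shown in this thread (an assumption,
# not an official schema).
EXPECTED_KEYS = ("id", "name", "screen_name", "description", "lang", "img_path")


def check_jsonl(path):
    """Return a list of human-readable problems found in a JSON Lines file."""
    problems = []
    with open(path, encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, start=1):
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            try:
                entry = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append(f"line {lineno}: invalid JSON ({exc.msg})")
                continue
            if not isinstance(entry, dict):
                problems.append(f"line {lineno}: expected a JSON object")
                continue
            missing = [key for key in EXPECTED_KEYS if key not in entry]
            if missing:
                problems.append(f"line {lineno}: missing keys {missing}")
            img_path = entry.get("img_path")
            if img_path and not os.path.isfile(img_path):
                problems.append(f"line {lineno}: image file not found: {img_path}")
    return problems
```

Calling check_jsonl("data.jsonl") from the directory where the script runs returns an empty list when every entry parses, carries all the keys, and points at an existing image.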

zijwang commented 2 years ago

@isaac1902 could you check whether your CPU utilization is high while it is waiting at 0%?

isaac1902 commented 2 years ago

@zijwang CPU utilization looks ok.
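If CPU utilization stays low while the progress bar sits at 0%, the process is more likely blocked than computing. One way to see where it is waiting is the standard library's faulthandler module, which can dump the tracebacks of all threads after a timeout. The wrapper below is a sketch with a hypothetical name (run_with_hang_dump), not something m3inference provides.

```python
import faulthandler
import sys


def run_with_hang_dump(fn, timeout=30.0, file=None):
    """Call fn(); if it is still running after `timeout` seconds, dump the
    tracebacks of all threads to `file`. The call itself is not interrupted."""
    target = file if file is not None else sys.stderr
    faulthandler.dump_traceback_later(timeout, file=target)
    try:
        return fn()
    finally:
        # Cancel the pending dump once fn() has returned or raised.
        faulthandler.cancel_dump_traceback_later()
```

For the case in this thread, the call would look like run_with_hang_dump(lambda: m3.infer("data.jsonl")), and the dumped stack shows which line the main thread is stuck on.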

isaac1902 commented 2 years ago

@computermacgyver @zijwang any idea about the issue? Many thanks.

computermacgyver commented 2 years ago

@isaac1902, I haven't yet been able to reproduce this, but I will keep trying. One thing worth trying is to force the model to run on CPU only. This just involves changing the first line of your code example to m3 = M3Inference(use_full_model=True, use_cuda=False)

isaac1902 commented 2 years ago

@computermacgyver I tried forcing CPU-only mode, but nothing changed (it was actually already running on CPU).

isaac1902 commented 2 years ago

Any news, anybody?

computermacgyver commented 2 years ago

@isaac1902 I've installed the version of Python you indicated and all the packages at the versions listed in the environment.txt file you supplied, but I have been unable to replicate the issue. I'd suggest creating a fresh conda environment and reinstalling the needed packages. If the problem persists, a full minimal example would be useful. I also note that the profile in your example does not have a profile photo set, and your code doesn't show what value is used for pic_url, which could be a point of difference from my testing.

# Environment setup
$ conda create -n py376test python==3.7.6
$ conda activate py376test
$ cat environment.txt | xargs -n 1 pip install
$ python
>>> data="""{"id": "2829", "name": "Nadal", "screen_name": "rafanadal", "description": "Sono andato a Roma ieri", "lang": "it", "img_path": "profile_pic.jpg"}"""
>>> with open("data.jsonl","w") as fh:
...   fh.write(data)
... 
146
>>> import m3inference
>>> m3 = m3inference.M3Inference(use_full_model=True)
10/12/2021 13:27:55 - INFO - m3inference.m3inference -   Version 1.1.5
10/12/2021 13:27:55 - INFO - m3inference.m3inference -   Running on cpu.
10/12/2021 13:27:55 - INFO - m3inference.m3inference -   Will use full M3 model.
10/12/2021 13:27:56 - INFO - m3inference.m3inference -   Model full_model exists at /home/shale/m3/models/full_model.mdl.
10/12/2021 13:27:56 - INFO - m3inference.utils -   Checking MD5 for model full_model at /home/shale/m3/models/full_model.mdl
10/12/2021 13:27:56 - INFO - m3inference.utils -   MD5s match.
10/12/2021 13:27:56 - INFO - m3inference.m3inference -   Loaded pretrained weight at /home/shale/m3/models/full_model.mdl
>>> m3inference.preprocess.download_resize_img("https://abs.twimg.com/sticky/default_profile_images/default_profile_400x400.png", "profile_pic.jpg", "profile_pic_fs.jpg")
>>> pred = m3.infer("data.jsonl")
10/12/2021 13:28:12 - INFO - m3inference.dataset -   1 data entries loaded.
Predicting...: 100%|████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  4.98it/s]
>>> pred
OrderedDict([('2829', {'gender': {'male': 0.9958, 'female': 0.0042}, 'age': {'<=18': 0.0499, '19-29': 0.158, '30-39': 0.7235, '>=40': 0.0686}, 'org': {'non-org': 0.9713, 'is-org': 0.0287}})])
>>>
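
The returned structure maps each account id to a probability distribution per attribute. A minimal sketch for reducing it to the most likely label for each attribute follows; the helper name top_labels is hypothetical and not part of m3inference, and the input is the prediction printed in the session above.

```python
from collections import OrderedDict


def top_labels(pred):
    """Map each account id to the most probable label for every attribute."""
    return {
        account_id: {
            attribute: max(probs, key=probs.get)
            for attribute, probs in attributes.items()
        }
        for account_id, attributes in pred.items()
    }


# The prediction printed in the session above.
pred = OrderedDict([
    ("2829", {
        "gender": {"male": 0.9958, "female": 0.0042},
        "age": {"<=18": 0.0499, "19-29": 0.158, "30-39": 0.7235, ">=40": 0.0686},
        "org": {"non-org": 0.9713, "is-org": 0.0287},
    }),
])

print(top_labels(pred))
# {'2829': {'gender': 'male', 'age': '30-39', 'org': 'non-org'}}
```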