euagendas / m3inference

A deep learning system for demographic inference (gender, age, and individual/person) that was trained on a massive Twitter dataset using profile images, screen names, names, and biographies
http://www.euagendas.org
GNU Affero General Public License v3.0

fix urllib errors while trying to fetch a profile image from twitter #20

Closed by Simone-Alghisi 3 years ago

Simone-Alghisi commented 3 years ago

Added more exception handlers in preprocess.py to cover the remaining urllib errors, such as ContentTooShortError, which occurred while I was fetching profile images from Twitter.

The same goes for ValueError, which I encountered when the profile_image_url_https field in the Twitter JSON was empty (i.e. "").
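
For readers who have not opened the diff, the kind of handling described above could look roughly like the sketch below. It is not the actual code in preprocess.py: the function name download_image and the injectable retrieve parameter are hypothetical and exist only so the error paths can be exercised without network access. The exception classes are the standard ones from urllib.error.

```python
import logging
import urllib.error
import urllib.request

logger = logging.getLogger(__name__)


def download_image(url, dest_path, retrieve=urllib.request.urlretrieve):
    """Try to save `url` to `dest_path`; return True on success, False otherwise.

    Hypothetical helper for illustration. `retrieve` defaults to
    urllib.request.urlretrieve and can be swapped out in tests.
    """
    try:
        retrieve(url, dest_path)
        return True
    except urllib.error.ContentTooShortError as err:
        # urlretrieve raises this when fewer bytes arrive than Content-Length announced.
        logger.warning("Truncated download for %r: %s", url, err)
    except urllib.error.HTTPError as err:
        # The server answered with an error status (404, 403, ...).
        logger.warning("HTTP error %s for %r", err.code, url)
    except urllib.error.URLError as err:
        # DNS failure, refused connection, and similar transport problems.
        logger.warning("URL error for %r: %s", url, err.reason)
    except ValueError as err:
        # Raised for malformed URLs, including the empty string "".
        logger.warning("Invalid URL %r: %s", url, err)
    return False
```

ContentTooShortError and HTTPError are both subclasses of URLError, so they are listed before it; otherwise the more general handler would catch them first.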

Finally, I added a check in m3twitter.py to verify that the profile image was downloaded successfully; if it was not, TW_DEFAULT_PROFILE_IMG is used instead to avoid a crash during the inference phase.
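
A minimal sketch of that fallback, assuming the check is a simple file-existence test (the actual line in m3twitter.py may differ). The helper name resolve_image_path and the placeholder value assigned to TW_DEFAULT_PROFILE_IMG are hypothetical; in m3inference the constant is defined by the library itself.

```python
import os

# Placeholder value for illustration only; the real constant lives in m3inference.
TW_DEFAULT_PROFILE_IMG = "tw_default_profile.png"


def resolve_image_path(downloaded_path, default_path=TW_DEFAULT_PROFILE_IMG):
    """Return the downloaded image path if the file exists, else the default image."""
    if downloaded_path and os.path.isfile(downloaded_path):
        return downloaded_path
    # Download failed or was skipped: fall back so inference still gets an image.
    return default_path
```

Falling back to a default image keeps the record in the batch, at the cost of the image input carrying no information for that user.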

computermacgyver commented 3 years ago

Thank you, Simone. These are great changes.

Simone-Alghisi commented 3 years ago

> Thank you, Simone. These are great changes.

Thanks, that was my first pull request so I'm glad it was good!