twintproject / twint

An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.
MIT License

[ERROR] MemoryError: Cannot allocate write+execute memory for ffi.callback() #571

Closed williamgunn closed 4 years ago

williamgunn commented 4 years ago

Issue Template

Please use this template!

Initial Check

Make sure you've checked the following:

Python version: 3.6.8; the same error occurs with 3.6.2 and 3.7.1.

Command Ran

Please provide the exact command ran including the username/search/code so I may reproduce the issue.

Running twint at the CLI with no arguments produces the error, as does running it with arguments, for example: twint -s "#openaccess" -o oa_tweets.csv --count --stats

A similar error results after running import twint in the Python interpreter.

Description of Issue

twint does not scrape, but rather sends the following Traceback message to the console.

Traceback (most recent call last):
  File "/home/williamgunn/.local/bin/twint", line 11, in <module>
    load_entry_point('twint', 'console_scripts', 'twint')()
  File "/home/williamgunn/opt/python-3.6.8/lib/python3.6/site-packages/pkg_resources/__init__.py", line 487, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/home/williamgunn/opt/python-3.6.8/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2728, in load_entry_point
    return ep.load()
  File "/home/williamgunn/opt/python-3.6.8/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2346, in load
    return self.resolve()
  File "/home/williamgunn/opt/python-3.6.8/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2352, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/home/williamgunn/src/twint/twint/__init__.py", line 14, in <module>
    from . import run
  File "/home/williamgunn/src/twint/twint/run.py", line 5, in <module>
    from . import datelock, feed, get, output, verbose, storage
  File "/home/williamgunn/src/twint/twint/get.py", line 6, in <module>
    import aiohttp
  File "/home/williamgunn/.local/lib/python3.6/site-packages/aiohttp/__init__.py", line 6, in <module>
    from .client import BaseConnector as BaseConnector
  File "/home/williamgunn/.local/lib/python3.6/site-packages/aiohttp/client.py", line 69, in <module>
    from .connector import BaseConnector as BaseConnector
  File "/home/williamgunn/.local/lib/python3.6/site-packages/aiohttp/connector.py", line 57, in <module>
    from .resolver import DefaultResolver
  File "/home/williamgunn/.local/lib/python3.6/site-packages/aiohttp/resolver.py", line 11, in <module>
    import aiodns
  File "/home/williamgunn/.local/lib/python3.6/site-packages/aiodns/__init__.py", line 4, in <module>
    import pycares
  File "/home/williamgunn/.local/lib/python3.6/site-packages/pycares/__init__.py", line 88, in <module>
    @_ffi.callback("void (void *data, ares_socket_t socket_fd, int readable, int writable )")
MemoryError: Cannot allocate write+execute memory for ffi.callback(). You might be running on a system that prevents this. For more information, see https://cffi.readthedocs.io/en/latest/using.html#callbacks

Environment Details

Using Windows, Linux? What OS version? Running this in Anaconda? Jupyter Notebook? Terminal?

Ubuntu 18.04.2 LTS on Dreamhost

Python packages

Package            Version   Location
------------------ --------- ---------------------------
aiodns             2.0.0
aiohttp            3.6.2
aiohttp-socks      0.2.2
async-timeout      3.0.1
attrs              19.3.0
beautifulsoup4     4.8.1
cchardet           2.1.4
cffi               1.13.2
chardet            3.0.4
elasticsearch      7.0.5
fake-useragent     0.1.11
geographiclib      1.50
geopy              1.20.0
idna               2.8
idna-ssl           1.1.0
multidict          4.5.2
numpy              1.17.3
pandas             0.25.3
pip                19.3.1
pycares            3.0.0
pycparser          2.19
PySocks            1.7.1
python-dateutil    2.8.1
pytz               2019.3
schedule           0.6.0
setuptools         40.6.2
six                1.13.0
soupsieve          1.9.5
twint              2.1.7     /home/williamgunn/src/twint
typing             3.7.4.1
typing-extensions  3.7.4.1
urllib3            1.25.6
yarl               1.3.0

williamgunn commented 4 years ago

I've checked with my shared hosting provider and they have given me conflicting information about whether memory resources exceeded the limit allowed in the shared hosting configuration.

I was able to find an issue in Scrapy that may be related: https://github.com/scrapy/scrapy/issues/4117
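For anyone who wants to check independently of the hosting provider, the error message says cffi could not get memory that is both writable and executable. Below is a minimal sketch of a probe that asks the kernel for such a mapping directly. It assumes a Unix host, the function name can_map_write_exec is made up for illustration, and it is not the allocation path cffi itself uses, so a passing probe does not guarantee that ffi.callback() will work.

```python
import mmap


def can_map_write_exec(size=mmap.PAGESIZE):
    """Hypothetical helper: try to create an anonymous mapping that is
    readable, writable and executable at the same time.

    Returns True if the kernel grants the mapping, False if it refuses.
    """
    prot = mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC
    try:
        # fileno -1 requests an anonymous mapping (Unix only).
        region = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE, prot=prot)
    except OSError:
        # Hardened or restricted hosts typically refuse with EPERM/EACCES.
        return False
    region.close()
    return True


if __name__ == "__main__":
    print("write+execute mapping allowed:", can_map_write_exec())
```

If this prints False on the shared host and True on another machine, that points to a host restriction rather than a twint or pycares bug.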

williamgunn commented 4 years ago

So it appears that it really was a problem with shared hosting, because when I run it on a VPS, it runs fine. For anyone that might be helped, I have a VPS at Dreamhost with 1 GB RAM, installed Python 3.7.1, and then had to manually install each requirement separately with pip3 before installing twint, to avoid an error about disk space.
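The one-at-a-time install described above can be scripted. This is only a sketch: the helper names are made up, it assumes the requirements are listed in a requirements.txt file in the twint checkout, and adding pip's --no-cache-dir option is a guess at what helps with limited disk space, since the cause of the disk space error was not identified.

```python
import subprocess
import sys


def parse_requirements(text):
    """Hypothetical helper: return requirement specifiers from the text of a
    requirements file, skipping blank lines, comment lines and pip options."""
    requirements = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        requirements.append(line)
    return requirements


def install_separately(requirements_path):
    """Install each requirement with its own pip invocation."""
    with open(requirements_path) as handle:
        requirements = parse_requirements(handle.read())
    for requirement in requirements:
        # --no-cache-dir stops pip from keeping downloaded archives on disk.
        subprocess.run(
            [sys.executable, "-m", "pip", "install", "--no-cache-dir", requirement],
            check=True,
        )


if __name__ == "__main__":
    # Assumed location; adjust to wherever the checkout lives.
    install_separately("requirements.txt")
```

Because check=True is set, the script stops at the first package that fails to install, which makes it easier to see which one runs out of space.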

pielco11 commented 4 years ago

Thank you for sharing your experience! I hope other users will not end up in such a situation, but if they do, here's something that could help them!