Open Tsuser1 opened 6 years ago
Additional stack traces: (Work unit newsbuddy:warrior_7_1532031940.55)
ERROR Fatal exception.
Traceback (most recent call last):
File "/home/box/.local/lib/python3.4/site-packages/dns/rdata.py", line 389, in get_rdata_class
File "/home/box/.local/lib/python3.4/site-packages/dns/rdata.py", line 374, in import_module
ImportError: No module named 'dns.rdtypes.IN.CNAME'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/box/wpull/freezer/pyinstaller/wpull_env/lib/python3.4/site-packages/wpull/app.py", line 128, in run
File "/home/box/.local/lib/python3.4/site-packages/trollius/tasks.py", line 251, in _step
File "/home/box/wpull/freezer/pyinstaller/wpull_env/lib/python3.4/site-packages/wpull/engine.py", line 281, in __call__
File "/home/box/.local/lib/python3.4/site-packages/trollius/tasks.py", line 253, in _step
File "/home/box/wpull/freezer/pyinstaller/wpull_env/lib/python3.4/site-packages/wpull/engine.py", line 70, in _run_workers
File "/home/box/.local/lib/python3.4/site-packages/trollius/futures.py", line 287, in result
File "/home/box/.local/lib/python3.4/site-packages/trollius/tasks.py", line 251, in _step
File "/home/box/wpull/freezer/pyinstaller/wpull_env/lib/python3.4/site-packages/wpull/engine.py", line 149, in _run_worker
File "/home/box/.local/lib/python3.4/site-packages/trollius/tasks.py", line 251, in _step
File "/home/box/wpull/freezer/pyinstaller/wpull_env/lib/python3.4/site-packages/wpull/engine.py", line 330, in _process_item
File "/home/box/.local/lib/python3.4/site-packages/trollius/tasks.py", line 251, in _step
File "/home/box/wpull/freezer/pyinstaller/wpull_env/lib/python3.4/site-packages/wpull/engine.py", line 387, in _process_url_item
File "/home/box/.local/lib/python3.4/site-packages/trollius/tasks.py", line 251, in _step
File "/home/box/wpull/freezer/pyinstaller/wpull_env/lib/python3.4/site-packages/wpull/processor/delegate.py", line 27, in process
File "/home/box/.local/lib/python3.4/site-packages/trollius/tasks.py", line 251, in _step
File "/home/box/wpull/freezer/pyinstaller/wpull_env/lib/python3.4/site-packages/wpull/processor/web.py", line 123, in process
File "/home/box/.local/lib/python3.4/site-packages/trollius/tasks.py", line 251, in _step
File "/home/box/wpull/freezer/pyinstaller/wpull_env/lib/python3.4/site-packages/wpull/processor/web.py", line 215, in process
File "/home/box/.local/lib/python3.4/site-packages/trollius/tasks.py", line 251, in _step
File "/home/box/wpull/freezer/pyinstaller/wpull_env/lib/python3.4/site-packages/wpull/processor/web.py", line 274, in _process_loop
File "/home/box/.local/lib/python3.4/site-packages/trollius/tasks.py", line 251, in _step
File "/home/box/wpull/freezer/pyinstaller/wpull_env/lib/python3.4/site-packages/wpull/processor/web.py", line 319, in _fetch_one
File "/home/box/.local/lib/python3.4/site-packages/trollius/tasks.py", line 251, in _step
File "/home/box/wpull/freezer/pyinstaller/wpull_env/lib/python3.4/site-packages/wpull/http/web.py", line 167, in fetch
File "/home/box/.local/lib/python3.4/site-packages/trollius/tasks.py", line 251, in _step
File "/home/box/wpull/freezer/pyinstaller/wpull_env/lib/python3.4/site-packages/wpull/http/client.py", line 70, in fetch
File "/home/box/.local/lib/python3.4/site-packages/trollius/tasks.py", line 251, in _step
File "/home/box/wpull/freezer/pyinstaller/wpull_env/lib/python3.4/site-packages/wpull/http/stream.py", line 445, in reconnect
File "/home/box/.local/lib/python3.4/site-packages/trollius/tasks.py", line 251, in _step
File "/home/box/wpull/freezer/pyinstaller/wpull_env/lib/python3.4/site-packages/wpull/connection.py", line 824, in connect
File "/home/box/.local/lib/python3.4/site-packages/trollius/tasks.py", line 251, in _step
File "/home/box/wpull/freezer/pyinstaller/wpull_env/lib/python3.4/site-packages/wpull/dns.py", line 143, in resolve_dual
File "/home/box/.local/lib/python3.4/site-packages/trollius/tasks.py", line 251, in _step
File "/home/box/wpull/freezer/pyinstaller/wpull_env/lib/python3.4/site-packages/wpull/dns.py", line 87, in resolve_all
File "/home/box/.local/lib/python3.4/site-packages/trollius/tasks.py", line 251, in _step
File "/home/box/wpull/freezer/pyinstaller/wpull_env/lib/python3.4/site-packages/wpull/dns.py", line 197, in _resolve_from_network
File "/home/box/.local/lib/python3.4/site-packages/trollius/tasks.py", line 255, in _step
File "/home/box/.local/lib/python3.4/site-packages/trollius/tasks.py", line 424, in wait_for
File "/home/box/.local/lib/python3.4/site-packages/trollius/futures.py", line 287, in result
File "/home/box/.local/lib/python3.4/site-packages/trollius/tasks.py", line 251, in _step
File "/home/box/wpull/freezer/pyinstaller/wpull_env/lib/python3.4/site-packages/wpull/dns.py", line 339, in _getaddrinfo_implementation
File "/home/box/.local/lib/python3.4/site-packages/trollius/tasks.py", line 251, in _step
File "/home/box/wpull/freezer/pyinstaller/wpull_env/lib/python3.4/site-packages/wpull/dns.py", line 306, in query_ipv4
File "/usr/local/lib/python3.4/concurrent/futures/thread.py", line 54, in run
File "/home/box/wpull/freezer/pyinstaller/wpull_env/lib/python3.4/site-packages/wpull/dns.py", line 352, in _query
File "/home/box/.local/lib/python3.4/site-packages/dns/resolver.py", line 834, in query
File "/home/box/.local/lib/python3.4/site-packages/dns/query.py", line 230, in udp
File "/home/box/.local/lib/python3.4/site-packages/dns/message.py", line 791, in from_wire
File "/home/box/.local/lib/python3.4/site-packages/dns/message.py", line 730, in read
File "/home/box/.local/lib/python3.4/site-packages/dns/message.py", line 704, in _get_section
File "/home/box/.local/lib/python3.4/site-packages/dns/rdata.py", line 476, in from_wire
File "/home/box/.local/lib/python3.4/site-packages/dns/rdata.py", line 394, in get_rdata_class
File "/home/box/.local/lib/python3.4/site-packages/dns/rdata.py", line 377, in import_module
AttributeError: 'module' object has no attribute 'CNAME'
CRITICAL Sorry, Wpull unexpectedly crashed.
CRITICAL Please report this problem to the authors at Wpull's issue tracker so it may be fixed. If you know how to program, maybe help us fix it? Thank you for helping us help you help us all.
See #322 and #323. I have no idea what could be causing this besides a botched dnspython installation. Or maybe the binary that NewsGrabber is using is somehow broken. I know there have been various issues with it under discussion in #newsgrabber before.
I'll attempt reinstalling dnspython to see if the issue is resolved.
Interestingly, when I executed pip3 uninstall dnspython, I noticed this segment in the console output:
...
/usr/local/lib/python3.5/dist-packages/dns/rdtypes/IN/A.py
/usr/local/lib/python3.5/dist-packages/dns/rdtypes/IN/AAAA.py
/usr/local/lib/python3.5/dist-packages/dns/rdtypes/IN/APL.py
/usr/local/lib/python3.5/dist-packages/dns/rdtypes/IN/DHCID.py
/usr/local/lib/python3.5/dist-packages/dns/rdtypes/IN/IPSECKEY.py
/usr/local/lib/python3.5/dist-packages/dns/rdtypes/IN/KX.py
/usr/local/lib/python3.5/dist-packages/dns/rdtypes/IN/NAPTR.py
/usr/local/lib/python3.5/dist-packages/dns/rdtypes/IN/NSAP.py
/usr/local/lib/python3.5/dist-packages/dns/rdtypes/IN/NSAP_PTR.py
/usr/local/lib/python3.5/dist-packages/dns/rdtypes/IN/PX.py
/usr/local/lib/python3.5/dist-packages/dns/rdtypes/IN/SRV.py
/usr/local/lib/python3.5/dist-packages/dns/rdtypes/IN/WKS.py
...
So the files were definitely there.
That's Python 3.5 though. Your tracebacks above used Python 3.4.
Is the ArchiveTeam Warrior using an internal version of Python for execution?
I'm not sure what the warrior VM is doing exactly. However, I think this is specific to NewsGrabber since this is the only project using wpull. And it's probably related to that wpull binary. Note the paths in the traceback, /home/box/wpull/freezer/pyinstaller/..., which don't actually exist (on my machine, anyway). Perhaps the dns.rdtypes package was not included in the binary.
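A quick way to test that guess is to ask a given interpreter which modules it can actually import and where they come from. The snippet below is only a rough sketch: `probe_import` is a made-up helper name, and the `dns.rdtypes.ANY.CNAME` entry is there because, as far as I remember dnspython's layout, CNAME lives under ANY rather than IN (which would also fit the uninstall listing, where no CNAME.py shows up between APL.py and DHCID.py).

```python
import importlib
import sys


def probe_import(module_name):
    """Try to import module_name and report where it was loaded from.

    Hypothetical helper for illustration; not part of wpull or dnspython.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        # Covers both a missing top-level package and a missing submodule.
        return {"module": module_name, "ok": False, "detail": str(exc)}
    origin = getattr(module, "__file__", None) or "<built-in or frozen>"
    return {"module": module_name, "ok": True, "detail": origin}


if __name__ == "__main__":
    # Shows which interpreter is answering, e.g. 3.4 versus 3.5.
    print("interpreter:", sys.executable, sys.version.split()[0])
    for name in (
        "dns",
        "dns.rdtypes.IN.A",
        "dns.rdtypes.IN.CNAME",   # the name from the first ImportError
        "dns.rdtypes.ANY.CNAME",  # assumed real location of CNAME
    ):
        print(probe_import(name))
```

Running it with each Python on the machine should show whether the 3.5 installation that pip3 manages is the same one the tracebacks came from. It says nothing about what is inside the wpull binary itself, since that brings its own interpreter.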
You might be able to work around it by using pip3.4 install dnspython, assuming you have a Python 3.4 installation on the machine. But something's very broken with that binary, and really such a binary shouldn't be necessary in the first place. (I believe the reason it exists is that NewsGrabber still requires Python 2, so this is some workaround to run wpull from Python 2. There has to be a better way though.)
I actually just tried pip3.4 install dnspython, but all I received was a "command not found" error. I looked around and couldn't find any trace of this ghost Python installation it has conjured up, so I agree that it is some sort of interesting implementation.
However, I have not seen the error occur in the past 15 minutes since using pip3 uninstall dnspython && pip3 install dnspython (reinstalling it). So, hopefully this preliminary conclusion holds true over time.
Yeah, I believe that binary is actually a bundle of Python 3.4 and all the necessary packages. With that, it's possible to run wpull even on a machine that doesn't have any Python 3 installation. It probably falls back to system-installed packages when it doesn't have them inside the binary or something like that. The proper solution would be porting ArchiveTeam/NewsGrabber-Warrior to Python 3 so this mess is no longer necessary. Or somehow executing wpull from inside Python 2 without this weird binary, which also has to be possible somehow.
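If there is some way to get code to run inside that bundle, a small check along these lines would settle whether it really is a frozen PyInstaller app and which directories it searches for packages. This is only a sketch: `describe_runtime` is a hypothetical name, and it relies on PyInstaller's convention of setting `sys.frozen` and `sys._MEIPASS` in bundled apps.

```python
import sys


def describe_runtime():
    """Summarize whether this interpreter is a frozen bundle and where it imports from.

    Hypothetical helper for illustration; not part of wpull.
    """
    return {
        "version": "%d.%d.%d" % sys.version_info[:3],
        "executable": sys.executable,
        # PyInstaller sets sys.frozen in bundled apps; plain interpreters lack it.
        "frozen": bool(getattr(sys, "frozen", False)),
        # PyInstaller exposes the bundle's directory as sys._MEIPASS.
        "bundle_dir": getattr(sys, "_MEIPASS", None),
        # Any system site-packages directory listed here could explain a fallback.
        "search_path": list(sys.path),
    }


if __name__ == "__main__":
    for key, value in describe_runtime().items():
        print(key, "=", value)
```

On an ordinary interpreter this reports frozen as False and bundle_dir as None. Seeing a system site-packages directory in search_path from inside the binary would support the fallback theory, and would also explain why reinstalling dnspython with pip3 appeared to help.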
What I wanted: Web crawling to work in an expected and normal manner
What I expect: Normal web crawling
What happened: DNS missing module errors.
The command or website that causes the problem: NewsGrabber Warrior
Operating system: Debian (Custom docker image)
Python version: Python 3.4
Wpull version: wpull-1.2.3-linux-x86_64-3.4.3-20160302011013
Log/Output: