dragnet-org / dragnet

Just the facts -- web page content extraction
MIT License

[Error] ValueError: Can't find libxml2 include headers #12

Closed IndianShifu closed 9 years ago

IndianShifu commented 9 years ago

Hey, I am working on my undergraduate project and have been trying to use dragnet, but I cannot get it to install after repeated attempts. I have already installed the required dependencies, but I still get the error "ValueError: Can't find libxml2 include headers" (the log is below). Please help. Thanks in advance.


c:\Python27\Scripts\pip run on 01/29/15 17:35:16
Downloading/unpacking dragnet
Getting page https://pypi.python.org/simple/dragnet/
URLs to search for versions for dragnet:

ValueError: Can't find libxml2 include headers


Cleaning up...
Removing temporary dir c:\users\faisal~1\appdata\local\temp\pip_build_FAISAL KHAN...
Command python setup.py egg_info failed with error code 1 in c:\users\faisal~1\appdata\local\temp\pip_build_FAISAL KHAN\dragnet
Exception information:
Traceback (most recent call last):
  File "C:\Python27\lib\site-packages\pip\basecommand.py", line 122, in main
    status = self.run(options, args)
  File "C:\Python27\lib\site-packages\pip\commands\install.py", line 278, in run
    requirement_set.prepare_files(finder, force_root_egg_info=self.bundle, bundle=self.bundle)
  File "C:\Python27\lib\site-packages\pip\req.py", line 1229, in prepare_files
    req_to_install.run_egg_info()
  File "C:\Python27\lib\site-packages\pip\req.py", line 325, in run_egg_info
    command_desc='python setup.py egg_info')
  File "C:\Python27\lib\site-packages\pip\util.py", line 697, in call_subprocess
    % (command_desc, proc.returncode, cwd))
InstallationError: Command python setup.py egg_info failed with error code 1 in c:\users\faisal~1\appdata\local\temp\pip_build_FAISAL KHAN\dragnet

matt-peters commented 9 years ago

Do you have libxml2 and lxml installed? Try python -c "import lxml" and see if it works.

If not, then first install libxml2 and try rebuilding. It looks like you are on Windows so this thread may be useful:

http://stackoverflow.com/questions/1904752/install-libxml2-and-associated-python-bindings-windows

(I don't have access to a Windows machine so unfortunately can't test it out myself).
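A slightly fuller version of the import check above, as a minimal sketch: it reports whether lxml can be imported and, if so, returns the include directories from `lxml.get_include()`. The helper name `check_lxml` is made up for illustration and is not part of dragnet or lxml.

```python
import importlib


def check_lxml():
    """Return lxml's include directories, or None if lxml cannot be imported.

    `check_lxml` is a hypothetical helper name used only for this example.
    """
    try:
        lxml = importlib.import_module("lxml")
    except ImportError:
        return None
    # lxml.get_include() returns a list of directories holding lxml's headers.
    return lxml.get_include()


if __name__ == "__main__":
    include_dirs = check_lxml()
    if include_dirs is None:
        print("lxml is not importable; install lxml (and libxml2) first")
    else:
        print("lxml found; include directories:")
        for path in include_dirs:
            print("  " + path)
```

Note that a successful import only shows that lxml itself is installed. Building a C extension against libxml2 also needs the libxml2 header files, which are a separate matter.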

IndianShifu commented 9 years ago

Thanks for replying,

I have libxml2 and lxml, but I still get the same error. Can you still help? (I am on Windows 7.)

"find_libxml2_include()" in setup.py seems to be causing the problem.

Is it because of the path you have given? Does it need to be changed for Windows 7?
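For readers wondering how a header lookup like this can depend on the platform, here is a hedged sketch of one way such a function could be written. It is not dragnet's actual `find_libxml2_include()`; only the function name and the error message come from the traceback in this thread. The `extra_dirs` parameter, the `LIBXML2_INCLUDE_DIR` environment variable, and the candidate directory list are assumptions made for illustration.

```python
import os
import sys


def find_libxml2_include(extra_dirs=()):
    """Return the first directory that contains libxml/xmlversion.h.

    Illustrative sketch only, not the code from dragnet's setup.py.
    """
    candidates = list(extra_dirs)

    # Hypothetical override so users on unusual layouts (e.g. Windows)
    # can point the build at their own copy of the headers.
    env_dir = os.environ.get("LIBXML2_INCLUDE_DIR")
    if env_dir:
        candidates.append(env_dir)

    # Common locations; the Unix paths will simply not exist on Windows.
    candidates.extend([
        os.path.join(sys.prefix, "include", "libxml2"),
        "/usr/include/libxml2",
        "/usr/local/include/libxml2",
    ])

    for directory in candidates:
        # xmlversion.h is one of the headers shipped with libxml2.
        if os.path.isfile(os.path.join(directory, "libxml", "xmlversion.h")):
            return directory

    raise ValueError("Can't find libxml2 include headers")
```

A search that only lists Unix-style paths fails on Windows even when libxml2 is installed, which is consistent with the question asked here. An explicit override keeps the lookup usable on layouts the author could not test.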

matt-peters commented 9 years ago

Ah yes, I see. I pushed a commit 688ac7d71434c408dc227dd1357ec8c09c8630fa to the windows_build branch that should fix this. Check out that branch and give it a try.

IndianShifu commented 9 years ago

That error no longer occurs :+1:

However, a new error seems to have cropped up:

dragnet\blocks.cpp(246) : fatal error C1083: Cannot open include file: 'stdint.h': No such file or directory
error: command 'C:\Users\FAISAL KHAN\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Bin\cl.exe' failed with exit status 2


Take a look at this - http://stackoverflow.com/questions/12970293/why-microsoft-visual-studio-cannot-find-stdint-h

matt-peters commented 9 years ago

Well that's unfortunate. At this point I'm afraid I won't be of much help, having done all my development on Linux-like operating systems and never in Windows. I don't even have access to a Windows machine to test things... I'd recommend forking the repo and modifying the code to get a build working on your machine. If you do and it doesn't break the Travis build then open a PR and we'll review it to merge.

IndianShifu commented 9 years ago

Will do that soon. Thanks.

b4hand commented 9 years ago

What version of Visual Studio are you using? Recent versions of VS should have stdint.h. That stackoverflow link you gave is from 2012.

b4hand commented 9 years ago

Specifically, see here for where it can be found, and that's from 2010:

http://choorucode.com/2010/04/13/visual-studio-2010-stdint-h/
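One way to approach the version question from the Python side is to ask the interpreter which compiler it was built with. The sketch below uses the standard-library `platform.python_compiler()`; the helper name `describe_build_compiler` is illustrative. On Windows builds of Python 2.7 the string typically starts with `MSC v.1500`, which corresponds to Visual C++ 9.0, the same version that appears in the cl.exe path of the error above.

```python
import platform
import sys


def describe_build_compiler():
    """Return a description of the compiler that built this interpreter.

    `describe_build_compiler` is a hypothetical helper name for this example.
    """
    return platform.python_compiler()


if __name__ == "__main__":
    print("Python version : " + sys.version.split()[0])
    print("Built with     : " + describe_build_compiler())
```

This only reports how the interpreter itself was built. It does not list every compiler installed on the machine.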

IndianShifu commented 9 years ago

I am using Eclipse with PyDev, though I do have VS 9 installed on my system. I also tested the same code on my Ubuntu virtual machine and, surprisingly, got the following error:


faisalkhan@faisalkhan-VirtualBox:~$ pip install dragnet
Downloading/unpacking dragnet
  Downloading dragnet-1.0.1.tar.gz (923kB): 923kB downloaded
  Running setup.py (path:/tmp/pip_build_faisalkhan/dragnet/setup.py) egg_info for package dragnet
    Traceback (most recent call last):
      File "<string>", line 17, in <module>
      File "/tmp/pip_build_faisalkhan/dragnet/setup.py", line 50, in <module>
        include_dirs = lxml.get_include() + [find_libxml2_include()],
      File "/tmp/pip_build_faisalkhan/dragnet/setup.py", line 36, in find_libxml2_include
        raise ValueError("Can't find libxml2 include headers")
    ValueError: Can't find libxml2 include headers
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 17, in <module>
      File "/tmp/pip_build_faisalkhan/dragnet/setup.py", line 50, in <module>
        include_dirs = lxml.get_include() + [find_libxml2_include()],
      File "/tmp/pip_build_faisalkhan/dragnet/setup.py", line 36, in find_libxml2_include
        raise ValueError("Can't find libxml2 include headers")
    ValueError: Can't find libxml2 include headers

Cleaning up...
Command python setup.py egg_info failed with error code 1 in /tmp/pip_build_faisalkhan/dragnet
Storing debug log for failure in /home/faisalkhan/.pip/pip.log

b4hand commented 9 years ago

That's an entirely different issue. Installing on Ubuntu requires the libxml2 development package, "libxml2-dev". Unfortunately, Python has no built-in way to force installation of the system-level prerequisites that C extensions depend on.

Also, install a newer version of Visual Studio. VS 9 is Visual Studio 2008, which is about seven years old at this point. You're not going to get much support from anyone to help you make things work on a development environment that old. New versions of Visual Studio Express are available as free downloads, and they include stdint.h, which has shipped with Visual Studio since the 2010 release.

IndianShifu commented 9 years ago

Thanks for the help