roosyay closed 4 years ago
Hi,
We do not support Python 2.7, since it has passed its end of life. Could you try installing it with Python > 3.5?
Also, please try installing all dependencies: python3 -m pip install -r requirements.txt
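A quick way to confirm which interpreter a virtual environment is actually using is a version check like the sketch below. The threshold of 3.6 is an assumption (a literal reading of "Python > 3.5"), and `is_supported` is a hypothetical helper, not part of pyrdf2vec.

```python
import sys

# Assumption: "Python > 3.5" is read as "3.6 or newer".
MIN_VERSION = (3, 6)


def is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


# Report the interpreter that is running this script.
print(sys.executable)
print("supported:", is_supported())
```

Running this inside the activated environment shows both the path of the interpreter and whether it meets the assumed minimum.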
Using Python 3.7.7 I get the output below. I tried installing freetype and png, but could not get that to work. Installing from requirements.txt gives me the same error for matplotlib.
>pip install pyrdf2vec
Requirement already satisfied: pyrdf2vec in c:\users\roos\environments\embeddings37\lib\site-packages\pyrdf2vec-0.0.3-py3.7.egg (0.0.3)
Requirement already satisfied: gensim==3.5.0 in c:\users\roos\environments\embeddings37\lib\site-packages (from pyrdf2vec) (3.5.0)
Collecting matplotlib==2.1.1
Using cached matplotlib-2.1.1.tar.gz (36.1 MB)
ERROR: Command errored out with exit status 1:
command: 'C:\Users\Roos\Environments\Embeddings37\Scripts\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\Roos\\AppData\\Local\\Temp\\pip-install-vp1_dumf\\matplotlib\\setup.py'"'"'; __file__='"'"'C:\\Users\\Roos\\AppData\\Local\\Temp\\pip-install-vp1_dumf\\matplotlib\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base 'C:\Users\Roos\AppData\Local\Temp\pip-install-vp1_dumf\matplotlib\pip-egg-info'
cwd: C:\Users\Roos\AppData\Local\Temp\pip-install-vp1_dumf\matplotlib\
Complete output (70 lines):
============================================================================
Edit setup.cfg to change the build options
BUILDING MATPLOTLIB
matplotlib: yes [2.1.1]
python: yes [3.7.7 (tags/v3.7.7:d7c567b08f, Mar 10 2020,
10:41:24) [MSC v.1900 64 bit (AMD64)]]
platform: yes [win32]
REQUIRED DEPENDENCIES AND EXTENSIONS
numpy: yes [not found. pip may install it below.]
six: yes [using six version 1.14.0]
dateutil: yes [using dateutil version 2.8.1]
backports.functools_lru_cache: yes [Not required]
subprocess32: yes [Not required]
pytz: yes [using pytz version 2019.3]
cycler: yes [cycler was not found. pip/easy_install may
attempt to install it after matplotlib.]
tornado: yes [tornado was not found. It is required for the
WebAgg backend. pip/easy_install may attempt to
install it after matplotlib.]
pyparsing: yes [using pyparsing version 2.4.6]
libagg: yes [pkg-config information for 'libagg' could not
be found. Using local copy.]
freetype: no [The C/C++ header for freetype
(freetype2\ft2build.h) could not be found. You may
need to install the development package.]
png: no [The C/C++ header for png (png.h) could not be
found. You may need to install the development
package.]
qhull: yes [pkg-config information for 'libqhull' could not
be found. Using local copy.]
OPTIONAL SUBPACKAGES
sample_data: yes [installing]
toolkits: yes [installing]
tests: no [skipping due to configuration]
toolkits_tests: no [skipping due to configuration]
OPTIONAL BACKEND EXTENSIONS
macosx: no [Mac OS-X only]
qt5agg: no [PySide2 not found; PyQt5 not found]
qt4agg: no [PySide not found; PyQt4 not found]
gtk3agg: no [Requires pygobject to be installed.]
gtk3cairo: no [Requires cairocffi or pycairo to be installed.]
gtkagg: no [Requires pygtk]
tkagg: yes [installing; run-time loading from Python Tcl /
Tk]
wxagg: no [requires wxPython]
gtk: no [Requires pygtk]
agg: yes [installing]
cairo: no [cairocffi or pycairo not found]
windowing: yes [installing]
OPTIONAL LATEX DEPENDENCIES
dvipng: no
ghostscript: no
latex: no
pdftops: no
OPTIONAL PACKAGE DATA
dlls: no [skipping due to configuration]
============================================================================
* The following required packages can not be built:
* freetype, png * Please check http://gnuwin32.sourc
* eforge.net/packages/freetype.htm for instructions
* to install freetype * Please check http://gnuwin32
* .sourceforge.net/packages/libpng.htm for
* instructions to install png
----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
Hmm it's been a while since I used a Windows machine... Perhaps this is related to https://github.com/pydicom/deid/issues/97 ?
Nevertheless, matplotlib is not really a strict dependency: we only use it to visualize a KG (which is often impossible because the graphs are too large).
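To illustrate what "not a strict dependency" can look like in code, here is a minimal sketch of a guarded import. It is not how pyrdf2vec itself handles matplotlib; `optional_import` and `can_visualise` are hypothetical names used only for this example.

```python
import importlib


def optional_import(name):
    """Import `name` if it is installed, otherwise return None."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# matplotlib is only needed for plotting, so a failed import is tolerated.
matplotlib = optional_import("matplotlib")


def can_visualise():
    """Return True if the optional plotting dependency is available."""
    return matplotlib is not None


print("matplotlib available:", can_visualise())
```

With a guard like this, code that only builds embeddings keeps working when matplotlib cannot be built, and only the plotting path needs to check `can_visualise()` first.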
Perhaps you could try cloning the repo and only install the dependencies required for this minimal example (numpy, sklearn, pandas and rdflib):
import random
import os
import numpy as np
os.environ['PYTHONHASHSEED'] = '42'
random.seed(42)
np.random.seed(42)
import rdflib
import pandas as pd
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix, accuracy_score
from converters import rdflib_to_kg
from rdf2vec import RDF2VecTransformer
from walkers import RandomWalker
import warnings
warnings.filterwarnings('ignore')
# Load our train & test instances and labels
test_data = pd.read_csv('../data/MUTAG_test.tsv', sep='\t')
train_data = pd.read_csv('../data/MUTAG_train.tsv', sep='\t')
train_people = [rdflib.URIRef(x) for x in train_data['bond']]
train_labels = train_data['label_mutagenic']
test_people = [rdflib.URIRef(x) for x in test_data['bond']]
test_labels = test_data['label_mutagenic']
all_labels = list(train_labels) + list(test_labels)
# Define the label predicates, all triples with these predicates
# will be excluded from the graph
label_predicates = [
    'http://dl-learner.org/carcinogenesis#isMutagenic'
]
# Convert the rdflib to our KnowledgeGraph object
kg = rdflib_to_kg('../data/mutag.owl', label_predicates=label_predicates)
random_walker = RandomWalker(2, float('inf'))
# Create embeddings with random walks
transformer = RDF2VecTransformer(walkers=[random_walker], sg=1)
walk_embeddings = transformer.fit_transform(kg, train_people + test_people)
# Fit model on the walk embeddings
train_embeddings = walk_embeddings[:len(train_people)]
test_embeddings = walk_embeddings[len(train_people):]
rf = RandomForestClassifier(random_state=42, n_estimators=100)
rf.fit(train_embeddings, train_labels)
print('Random Forest:')
print(accuracy_score(test_labels, rf.predict(test_embeddings)))
print(confusion_matrix(test_labels, rf.predict(test_embeddings)))
clf = GridSearchCV(SVC(random_state=42), {'kernel': ['linear', 'poly', 'rbf'], 'C': [10**i for i in range(-3, 4)]})
clf.fit(train_embeddings, train_labels)
print('Support Vector Machine:')
print(accuracy_score(test_labels, clf.predict(test_embeddings)))
print(confusion_matrix(test_labels, clf.predict(test_embeddings)))
After cloning the repo and installing the packages (numpy, sklearn, pandas and rdflib), I get the following ImportError when I try to run the code you gave me.
ImportError Traceback (most recent call last)
<ipython-input-4-c05727c713ee> in <module>
15 from sklearn.metrics import confusion_matrix, accuracy_score
16
---> 17 from converters import rdflib_to_kg
18 from rdf2vec import RDF2VecTransformer
19
ImportError: cannot import name 'rdflib_to_kg' from 'converters' (c:\users\roos\emb\lib\site-packages\converters\__init__.py)
You will have to make sure they are installed in your global path. For now, you could just run the example from the rdf2vec/ directory, as converters.py and rdf2vec.py are in there. I will (hopefully) be able to release a better update today.
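The traceback shows `converters` being loaded from site-packages\converters\__init__.py, which suggests that a different installed package with the same name is found before the repository's converters.py. The sketch below is one way to check where a module name resolves and to put a directory first on the search path. `module_origin` and `prefer_directory` are hypothetical helpers, and the commented-out path is a placeholder for wherever the repository was cloned.

```python
import importlib
import importlib.util
import sys
from pathlib import Path


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def prefer_directory(directory):
    """Move `directory` to the front of sys.path so its modules are found first."""
    path = str(Path(directory).resolve())
    if path in sys.path:
        sys.path.remove(path)
    sys.path.insert(0, path)
    # Make the import system notice files added since the last lookup.
    importlib.invalidate_caches()
    return path


# Placeholder: replace with the real location of the cloned rdf2vec/ directory.
# prefer_directory("path/to/rdf2vec")

# Shows which file `converters` currently resolves to (None if not found).
print(module_origin("converters"))
```

Running the example from inside the rdf2vec/ directory has a similar effect, because the working directory (or the script's directory) normally comes first on `sys.path`. If `converters` was already imported in the same session, restart the interpreter before retrying, since Python caches imported modules.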
Could you check if you still have issues with a clean install?
Works now! Great thank you :-)
Trying to install using Python 3.7/3.8 gave me similar but more complex errors, so I downgraded to try with Python 2.7. I get this single error for scikit_learn (see below).