src-d / vecino

Vecino is a command line application to discover Git repositories which are similar to the one that the user provides.

Tried to use it, failed miserably #5

Closed by campoy 6 years ago

campoy commented 6 years ago

Running vecino https://github.com/apache/spark fails with an AttributeError:

$ vecino https://github.com/apache/spark
WARNING:bblfsh:Failed to ensure that the Babelfish server is running.
INFO:id2vec:Reading /Users/francesc/.source{d}/id2vec/default.asdf...
INFO:id2vec:Building the token index...
INFO:similar_repos:Loaded id2vec model: {'created_at': datetime.datetime(2017, 6, 18, 17, 37, 6, 255615),
 'dependencies': [],
 'model': 'id2vec',
 'uuid': '92609e70-f79c-46b5-8419-55726e873cfc',
 'version': [1, 0, 0]}
Shape: (999424, 300)
First 10 words: ['get', 'name', 'type', 'string', 'class', 'set', 'data', 'value', 'self', 'test']
INFO:docfreq:Reading /Users/francesc/.source{d}/docfreq/default.asdf...
INFO:docfreq:Building the docfreq dictionary...
INFO:docfreq:Pruning to min 20 occurrences
INFO:similar_repos:Loaded document frequencies: {'created_at': datetime.datetime(2017, 6, 19, 9, 59, 14, 766638),
 'dependencies': [],
 'model': 'docfreq',
 'uuid': 'f64bacd4-67fb-4c64-8382-399a8e7db52a',
 'version': [1, 0, 0]}
Number of words: 416370
First 10 words: ['aaa', 'aaaa', 'aaaaa', 'aaaaaa', 'aaaaaaa', 'aaaaaaaa', 'aaaaaaaaa', 'aaaaaaaaaa', 'aaaaaaaaaaa', 'aaaaaaaaaaaa']
Number of documents: 112273
INFO:nbow:Reading /Users/francesc/.source{d}/nbow/default.asdf...
INFO:nbow:Building the repository names mapping...
INFO:similar_repos:Loaded nBOW model: {'created_at': datetime.datetime(2017, 6, 19, 9, 16, 8, 942880),
 'dependencies': [{'created_at': datetime.datetime(2017, 6, 18, 17, 37, 6, 255615),
                   'dependencies': [],
                   'model': 'id2vec',
                   'uuid': '92609e70-f79c-46b5-8419-55726e873cfc',
                   'version': [1, 0, 0]},
                  {'created_at': datetime.datetime(2017, 6, 19, 9, 59, 14, 766638),
                   'dependencies': [],
                   'model': 'docfreq',
                   'uuid': 'f64bacd4-67fb-4c64-8382-399a8e7db52a',
                   'version': [1, 0, 0]}],
 'model': 'nbow',
 'uuid': '1e3da42a-28b6-4b33-94a2-a5671f4102f4',
 'version': [1, 0, 0]}
Shape: (112273, 999424)
First 10 repos: ['ikizir/HohhaDynamicXOR', 'ditesh/node-poplib', 'Code52/MarkPadRT', 'wp-shortcake/shortcake', 'capaj/Moonridge', 'HugoGiraudel/hugogiraudel.github.com', 'crosswalk-project/crosswalk-website', 'apache/parquet-mr', 'dciccale/kimbo.js', 'processone/oneteam']
Traceback (most recent call last):
  File "/usr/local/bin/vecino", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/site-packages/vecino/__main__.py", line 72, in main
    "vocabulary_max": args.vocabulary_max}
  File "/usr/local/lib/python3.6/site-packages/vecino/similar_repositories.py", line 53, in __init__
    assert self._nbow.get_dependency("id2vec")["uuid"] == self._id2vec.meta["uuid"]
AttributeError: 'NBOW' object has no attribute 'get_dependency'

Just in case it's useful, this is what my pip3 list returns:

$ pip3 list
appnope (0.1.0)
args (0.1.0)
asdf (1.3.1)
ast2vec (0.3.6a0)
astropy (2.0.2)
attrs (17.3.0)
bblfsh (2.6.1)
bleach (1.5.0)
cachetools (2.0.1)
certifi (2017.11.5)
chardet (3.0.4)
clint (0.5.1)
cycler (0.10.0)
decorator (4.1.2)
docker (2.6.1)
docker-pycreds (0.2.1)
enum34 (1.1.6)
google-api-core (0.1.1)
google-auth (1.2.1)
google-cloud-core (0.28.0)
google-cloud-storage (1.6.0)
google-resumable-media (0.3.1)
googleapis-common-protos (1.5.3)
grpcio (1.7.0)
grpcio-tools (1.7.0)
h5py (2.7.1)
html5lib (0.9999999)
idna (2.6)
ipykernel (4.6.1)
ipython (6.2.1)
ipython-genutils (0.2.0)
jedi (0.11.0)
jsonschema (2.6.0)
jupyter-client (5.1.0)
jupyter-core (4.4.0)
Keras (2.0.9)
lz4 (0.11.1)
Markdown (2.6.9)
matplotlib (2.1.0)
modelforge (0.3.1a0)
netifaces (0.10.6)
numpy (1.13.3)
olefile (0.44)
parso (0.1.0)
pexpect (4.3.0)
pickleshare (0.7.4)
Pillow (4.3.0)
pip (9.0.1)
pluggy (0.6.0)
prompt-toolkit (1.0.15)
protobuf (3.4.0)
ptyprocess (0.5.2)
py (1.5.2)
pyasn1 (0.4.2)
pyasn1-modules (0.2.1)
Pygments (2.2.0)
pyparsing (2.2.0)
PyStemmer (1.3.0)
pytest (3.3.0)
python-dateutil (2.6.1)
pytz (2017.3)
PyYAML (3.12)
pyzmq (16.0.3)
requests (2.18.4)
rsa (3.4.2)
scipy (0.19.1)
semantic-version (2.6.0)
setuptools (36.5.0)
simplegeneric (0.8.1)
six (1.11.0)
tensorflow (1.4.0)
tensorflow-tensorboard (0.4.0rc2)
tornado (4.5.2)
traitlets (4.3.2)
urllib3 (1.22)
vecino (0.1.5a0)
wcwidth (0.1.7)
websocket-client (0.44.0)
Werkzeug (0.12.2)
wheel (0.30.0)
wmd (1.2.6)

vmarkovtsev commented 6 years ago

Thanks! I hadn't pinned the dependency versions properly, so the code went stale against the newer releases. It should be working now.
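
For readers hitting the same traceback: the AttributeError above is how this kind of dependency mismatch shows up at run time. Below is a minimal sketch of a guard that turns it into a readable message. It assumes nothing about vecino's real code beyond the get_dependency call visible in the traceback; the helper require_method and the two stand-in classes are hypothetical.

```python
def require_method(obj, name, hint):
    """Raise a readable error if obj lacks a callable attribute `name`."""
    if not callable(getattr(obj, name, None)):
        raise RuntimeError(
            "%s has no method %r; the installed dependency is probably "
            "incompatible. %s" % (type(obj).__name__, name, hint))


class OldNBOW:
    """Hypothetical stand-in for a model class that lacks the method."""


class NewNBOW:
    """Hypothetical stand-in for a model class that provides the method."""

    def get_dependency(self, name):
        # The uuid is the id2vec uuid printed in the log above.
        return {"uuid": "92609e70-f79c-46b5-8419-55726e873cfc"}


# Passes silently when the method exists.
require_method(NewNBOW(), "get_dependency", "Reinstall matching versions.")

# Raises RuntimeError with an actionable message when it does not.
try:
    require_method(OldNBOW(), "get_dependency", "Reinstall matching versions.")
except RuntimeError as exc:
    print(exc)
```

A guard like this only improves the message. The fix described in the comment above, pinning dependency versions, is what prevents the mismatch.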

campoy commented 6 years ago

It still wasn't working, and then I realized it requires a running bblfsh server. Is that the case? Does the server need to be local and listening on port 9432, or is there a way to change this?

vmarkovtsev commented 6 years ago

Right, you can point it at another server using --bblfsh whatever:9432

campoy commented 6 years ago

Would it be worth creating an issue to improve the error message when the server is not available?
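
One possible shape for such a check, as a rough sketch only: try a plain TCP connection to the endpoint before doing any work, and report a message that names the address and the --bblfsh flag. The function names parse_endpoint and check_server are hypothetical and not part of vecino or bblfsh; the default port 9432 is the one mentioned above. A successful TCP connection only shows that something is listening, not that it is a Babelfish server.

```python
import socket


def parse_endpoint(endpoint, default_port=9432):
    """Split 'host:port' into (host, port); use default_port if no port is given.

    IPv6 literals are not handled in this sketch.
    """
    host, sep, port = endpoint.rpartition(":")
    if not sep:
        return endpoint, default_port
    return host, int(port)


def check_server(endpoint, timeout=2.0):
    """Return None if a TCP connection succeeds, else a readable error message."""
    host, port = parse_endpoint(endpoint)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return None
    except OSError as exc:
        return ("Cannot reach the Babelfish server at %s:%d (%s). "
                "Start a server or pass --bblfsh host:port."
                % (host, port, exc))


if __name__ == "__main__":
    problem = check_server("localhost:9432")
    if problem:
        print(problem)
```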