commonsense / conceptnet-numberbatch

Accessing the ensembled vectors #43

Closed meghmehta closed 7 years ago

meghmehta commented 7 years ago

After running "ninja", where can I find the ensembled vectors?

rspeer commented 7 years ago

In this build process, the labels will appear in build-data/combo840.standardized.conceptnet5.labels and the vectors in build-data/combo840.l1.standardized.conceptnet5.npy. The filenames are elaborate because this repository builds many different variations of the system to compare in an evaluation.

There is also a new build process that's part of the conceptnet5 repository, which will give you more up-to-date vectors in data/vectors/numberbatch.h5, as well as the tools you need to look up ConceptNet nodes in that vector space.
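For the first pair of files named above, a minimal sketch of reading them back in Python might look like the following. It rests on assumptions the thread does not confirm: that the `.labels` file is UTF-8 text with one label per line, and that line i of it names row i of the `.npy` matrix. The helper names `load_vectors` and `vector_for` are made up for illustration.

```python
import numpy as np


def load_vectors(labels_path, vectors_path):
    """Return (labels, matrix) and check that they line up row for row."""
    # Assumption: one label per line, UTF-8 encoded.
    with open(labels_path, encoding="utf-8") as f:
        labels = [line.rstrip("\n") for line in f]
    # mmap_mode="r" maps the file instead of reading it all into RAM,
    # which helps when the matrix is large.
    matrix = np.load(vectors_path, mmap_mode="r")
    if len(labels) != matrix.shape[0]:
        raise ValueError(
            f"{len(labels)} labels but {matrix.shape[0]} matrix rows"
        )
    return labels, matrix


def vector_for(labels, matrix, term):
    """Return the row for `term`; raises ValueError if the label is absent."""
    return np.asarray(matrix[labels.index(term)])
```

Usage would be along the lines of `labels, matrix = load_vectors("build-data/combo840.standardized.conceptnet5.labels", "build-data/combo840.l1.standardized.conceptnet5.npy")`. The row-count check is there so that a mismatch between the two files fails loudly rather than silently returning the wrong vector.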

meghmehta commented 7 years ago

Thank you. My problem is that I can't find any build-data folder in the repository, so I am not sure where to find these files. Would you be able to guide me in this?

rspeer commented 7 years ago

That's where the files created by ninja go.

What happened when you ran ninja in the code/ directory? Did it crash?

meghmehta commented 7 years ago

So I've been running the 16.04 branch. It basically runs through build_conceptnet_retrofitting(), which is called in ninja.py. But there's only a source_data folder; I can't find a build_data folder.

rspeer commented 7 years ago

Oh. You haven't run the build yet.

Running ninja.py outputs a file called build.ninja, which contains instructions to the Ninja build system (https://ninja-build.org/) for how to build ConceptNet Numberbatch. After that, you run ninja to actually run the build.

meghmehta commented 7 years ago

Yes, I did run ninja as well. I ran ninja.py and then I ran ninja, but I don't get any new outputs.

meghmehta commented 7 years ago

I actually get the error: Importerror: cannot import ninja from ninja

rspeer commented 7 years ago

Can you copy and paste the actual output, please?

rspeer commented 7 years ago

What's particularly peculiar about saying you got "Importerror: cannot import ninja from ninja" is that it looks almost like a Python error, and this makes me doubt you're actually running Ninja, which is written in C++.
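One way to check which program the name `ninja` resolves to is to ask the shell's search path directly. The sketch below is only an illustration: `describe_tool` is a hypothetical helper, and while `--version` is a flag the Ninja build system accepts, what a different program installed under the same name would do with it is unknown, hence the timeout.

```python
import shutil
import subprocess


def describe_tool(name="ninja", timeout=5):
    """Return (path, version_output) for an executable on PATH, or None."""
    path = shutil.which(name)
    if path is None:
        return None  # nothing called `name` is on PATH
    try:
        result = subprocess.run(
            [path, "--version"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        # Some tools print their version on stderr rather than stdout.
        output = (result.stdout or result.stderr).strip()
    except subprocess.TimeoutExpired:
        output = "<no answer within timeout>"
    except OSError as exc:
        output = f"<could not run: {exc}>"
    return path, output


if __name__ == "__main__":
    print(describe_tool("ninja"))
```

If the reported path points into a Python environment, or the command does not simply print a version number and exit, that would suggest something other than the C++ build tool is being run.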

meghmehta commented 7 years ago

I actually don't have that error anymore. When I run ninja, it gives the following output:

log: warning: no configuration file specified, using default values
log: ninja version 0.1.3 initializing
log: magic group: gid=0 (wheel)
log: entering main loop
log: generating initial pid array..
log: now monitoring process activity

How long would it take for it to complete? It's been running for > 1 hour now.

rspeer commented 7 years ago

It should take a day or so. Keep in mind that this process runs all the conditions of the experiment.

meghmehta commented 7 years ago

Okay, thank you!

meghmehta commented 7 years ago

Is it GPU enabled?

rspeer commented 7 years ago

Man, how difficult to set up do you want it to be?

This task is memory-constrained, so I think it would actually run much worse on a GPU.