Closed alphamupsiomega closed 8 years ago
I checked the instructions. The cd code/data part seems spurious, as that's not the name of any directory involved; sorry about that. But I just ran git annex get in the root of a fresh checkout and it worked. Could you try it there?
I'm confused that you get no output from git annex get. Does git annex init do anything?
There isn't really an alternative to git-annex at the moment; I wanted to make sure to version-control the files so that someone could get a precise version of the data that goes with a precise version of the code.
But I have concluded that git-annex is a pain, especially for public consumption, so you might be happy to know that I've avoided git-annex and used a much more straightforward downloading scheme as I work on merging this code into the next version of the ConceptNet code.
(python3env)MYMBP:source-data MY$ git init
Reinitialized existing Git repository in /Users/MY/conceptnet-numberbatch/code/source-data/.git/
(python3env)MYMBP:source-data MY$ git annex init
init ok
(recording state in git...)
(python3env)MYMBP:source-data MY$ git annex get
(python3env)MYMBP:source-data MY$
I've tried this in other directories, and there is no difference. I assume you have the actual data files somewhere--can you find another way to share these files?
I'll probably be working on that, but I'd like to understand what went wrong with the instructions here, for the benefit of others.
Why is /Users/MY/conceptnet-numberbatch/code/source-data/.git/ its own git repository? Did something in the instructions lead to that, or did you previously run git init there as well? That would presumably be the problem. Your source-data directory is an empty repository, and that's why git-annex has nothing to get.
The git repository you should be using is the one in the conceptnet-numberbatch directory. There shouldn't be any sub-repositories involved. I recommend removing the source-data/.git directory, not running git init (that's way different from git annex init), and trying again.
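Before deleting anything, it can help to see which nested .git directories exist under the checkout. Here is a minimal sketch, assuming a POSIX shell and a find that supports -mindepth (GNU and BSD find both do). The name list_nested_repos is a hypothetical helper for illustration, not part of git or git-annex.

```shell
# list_nested_repos: hypothetical helper, not part of git or git-annex.
# Prints every .git directory that sits below the top level of ROOT,
# i.e. repositories nested inside the main checkout.
list_nested_repos() {
    root="${1:-.}"
    # -mindepth 2 skips ROOT/.git itself (depth 1); -prune avoids
    # descending into each nested .git directory once it is found.
    find "$root" -mindepth 2 -type d -name .git -prune -print
}
```

Run from the conceptnet-numberbatch directory, any path this prints (for example one ending in code/source-data/.git) is a stray sub-repository to review before removing it.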
I did that as well, and there is no difference.
(python3env)MYMBP:conceptnet-numberbatch MY$ git annex get
git-annex: Not in a git repository.
(python3env)MYMBP:conceptnet-numberbatch MY$ git init
Initialized empty Git repository in /Users/MY/xNLP/conceptnet-numberbatch/.git/
(python3env)MYMBP:conceptnet-numberbatch MY$ git annex get
git-annex: First run: git-annex init
(python3env)MYMBP:conceptnet-numberbatch MY$ git-annex init
init ok
(recording state in git...)
(python3env)MYMBP:conceptnet-numberbatch MY$ git annex get
(python3env)MYMBP:conceptnet-numberbatch MY$
You should stop running git init. It's creating empty Git repositories.
You didn't tell me that you got "git-annex: Not in a git repository" before. This is what you need to fix. You need to be in a git repository. Not an empty one that you just made. You need to be in the conceptnet-numberbatch repository, the one that you presumably cloned at some point and that you're reporting an issue on.
Meanwhile, I've run the current directions from scratch and confirmed that they work (all I had to fix was the thing about code/data). I think you've messed up your repository by running extra commands, but if you start over, it should work.
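One way to tell "not in a repository", "in an empty repository", and "in a real clone" apart is to ask git directly. This is a rough sketch, assuming git 1.8.5 or later (for the -C option). The name check_repo and its messages are hypothetical, made up for illustration.

```shell
# check_repo: hypothetical helper, not part of git or git-annex.
# Reports whether DIR is inside a Git repository and whether that
# repository has any commits.
check_repo() {
    dir="${1:-.}"
    # rev-parse --show-toplevel fails when DIR is not inside a work tree.
    if ! top=$(git -C "$dir" rev-parse --show-toplevel 2>/dev/null); then
        echo "not a git repository"
        return 1
    fi
    # In a repository created by a bare "git init", HEAD does not
    # resolve to a commit yet, so --verify fails.
    if ! git -C "$dir" rev-parse --verify --quiet HEAD >/dev/null; then
        echo "empty repository at $top"
        return 2
    fi
    echo "repository with history at $top"
}
```

In a proper clone of conceptnet-numberbatch, check_repo . should take the last branch; a directory where git init was just run takes the "empty repository" branch, which matches the situation where git annex get prints nothing.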
How do I get "in" a git repository with git annex? I've not used git annex before, and its walkthrough does not explain the difference between git init and git annex init. The walkthrough just runs one after the other, so it's not clear what each does. Would you mind writing out, line by line, the exact commands I need to type? Per your instructions I've already deleted the empty Git repositories.
git-annex is a tool that manages large files in a Git repository. You use it as part of a Git repository. There isn't a separate idea of a "git annex repository"; it's just extra data in a Git repository. conceptnet-numberbatch has this extra data, as a way of having large, version-controlled data files.
git init creates a new Git repository. git annex init sets up an existing Git repository to use git-annex if it's not already. You saw these in the git-annex walkthrough because it's walking you through how to start a new project. My directions told you to read the walkthrough, because git-annex is confusing, and thanks for reading it, but if you follow the walkthrough's directions you're going to make a new project. You should follow Numberbatch's directions, not the walkthrough's directions, to get Numberbatch.
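To make the difference concrete, the two commands can be tried in a throwaway directory that has nothing to do with Numberbatch. This is only a sketch: demo_init is a hypothetical name, the git-annex step is skipped when git-annex is not installed, and the check for a git-annex branch assumes that git annex init records its state on a branch of that name.

```shell
# demo_init: hypothetical helper that contrasts the two commands
# in a temporary directory, then cleans up.
demo_init() {
    work=$(mktemp -d) || return 1

    # git init creates a brand-new, empty repository (a .git directory).
    git init -q "$work"
    if [ -d "$work/.git" ]; then
        echo "git init: created $work/.git"
    fi

    if command -v git-annex >/dev/null 2>&1; then
        # git annex init adds git-annex bookkeeping to the repository
        # that already exists; it does not create a second repository.
        (cd "$work" && git annex init >/dev/null 2>&1) || true
        if git -C "$work" show-ref --verify --quiet refs/heads/git-annex; then
            echo "git annex init: added a git-annex branch to the same repository"
        fi
    else
        echo "git-annex not found; skipping the git annex init step"
    fi

    rm -rf "$work"
}
```

Neither command downloads anything. In a fresh directory they leave a repository with no files for git annex get to fetch, which is why the cloned repository is the one to work in.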
Here's what you'll need to run. First clone this repository and go to its directory:
git clone https://github.com/LuminosoInsight/conceptnet-numberbatch
cd conceptnet-numberbatch
Then run the directions in the README:
cd code
python setup.py develop
git annex get
cd ..
python ninja.py
ninja
Got it, git annex get works now. Thanks for your help. Since the cd code/data step did not work initially, I thought git annex had to be initialized separately from your instructions.
After everything else is successful, including a Git Annex installation, when I type:
Nothing occurs. No downloads start. I tried this with cd code/source-data, and it does not work either. How can I download the data files?
Alternatively, is there a way to obtain the files without git annex?