liberapay / git-lfs-fetch.py

Lightweight Git Large File Storage fetcher written in python

failing fetch from https server with unknown ssl certificate #3

Open volviq opened 7 years ago

volviq commented 7 years ago

I'm trying to fetch the LFS files with: python -m git_lfs -vv

but then I get:

Fetching URLs from https://git.example.loc/~username/repo_git_lfs.git/info/lfs...
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/lib/python2.7/dist-packages/git_lfs/__main__.py", line 14, in <module>
    fetch(args.git_repo, args.checkout_dir, args.verbose)
  File "/usr/local/lib/python2.7/dist-packages/git_lfs/__init__.py", line 164, in fetch
    objects = fetch_urls(lfs_url, oid_list)
  File "/usr/local/lib/python2.7/dist-packages/git_lfs/__init__.py", line 103, in fetch_urls
    resp = json.loads(urlopen(req).read().decode('ascii'))
  File "/usr/lib/python2.7/urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 404, in open
    response = self._open(req, data)
  File "/usr/lib/python2.7/urllib2.py", line 422, in _open
    '_open', req)
  File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 1222, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "/usr/lib/python2.7/urllib2.py", line 1184, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [Errno 111] Connection refused>

For git via https I have to set

[http]
    sslVerify = false

to make clones and pulls work. The cause of the problem here may be the same, with git_lfs simply not recognizing this setting.
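For reference, a minimal Python 3 sketch of what honoring such a setting could look like on the urllib side (the traceback above is from Python 2.7, so this is not the code git_lfs runs). The names `make_ssl_context` and `open_lfs_url` are hypothetical and not part of git_lfs; the assumption is only that the caller already knows whether verification should be skipped.

```python
import ssl
from urllib.request import Request, urlopen


def make_ssl_context(verify=True):
    """Build an SSL context; verify=False mirrors git's `http.sslVerify = false`."""
    ctx = ssl.create_default_context()
    if not verify:
        # check_hostname must be switched off before verify_mode can be CERT_NONE.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


def open_lfs_url(url, verify=True, timeout=30):
    """Open `url`, optionally skipping certificate checks (hypothetical helper)."""
    return urlopen(Request(url), context=make_ssl_context(verify), timeout=timeout)
```

Skipping verification is only reasonable on a trusted internal network; it would not change a "Connection refused" error, which happens before any certificate is exchanged.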

Changaco commented 7 years ago

That doesn't look like an SSL error. I'm pretty sure urllib2 doesn't even try to verify certificates by default.

Changaco commented 7 years ago

The computed URL is probably not the right one. You can set a custom one in a .lfsconfig file, like this: https://github.com/liberapay/liberapay.com/blob/163/.lfsconfig
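To check which value a `.lfsconfig` file actually yields, something like the following Python 3 sketch can help. It assumes the file has an `[lfs]` section with a `url` key, as in the linked example; `read_lfs_url` is a hypothetical helper, and `configparser` only approximates Git's config syntax, so this is not necessarily how git_lfs itself reads the file.

```python
import configparser


def read_lfs_url(path=".lfsconfig"):
    """Return the `url` value of the [lfs] section, or None if it is absent."""
    # interpolation=None keeps '%' characters in URLs from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)  # a missing file is silently ignored
    return parser.get("lfs", "url", fallback=None)


if __name__ == "__main__":
    print(read_lfs_url())
```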

volviq commented 7 years ago

I think that goes in the right direction. I looked into the Bitbucket URL and modified the .lfsconfig file (see below). Now I get this instead:

python -m git_lfs -vv
Fetching URLs from https://git.example.loc:3333/rest/git-lfs/storage/~USERNAME/repo_git_lfs.git/info/lfs...
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/lib/python2.7/dist-packages/git_lfs/__main__.py", line 14, in <module>
    fetch(args.git_repo, args.checkout_dir, args.verbose)
  File "/usr/local/lib/python2.7/dist-packages/git_lfs/__init__.py", line 164, in fetch
    objects = fetch_urls(lfs_url, oid_list)
  File "/usr/local/lib/python2.7/dist-packages/git_lfs/__init__.py", line 103, in fetch_urls
    resp = json.loads(urlopen(req).read().decode('ascii'))
  File "/usr/lib/python2.7/urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 410, in open
    response = meth(req, response)
  File "/usr/lib/python2.7/urllib2.py", line 523, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.7/urllib2.py", line 448, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 531, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 500: Internal Server Error

I know that the final url looks like:

https://git.example.loc:3333/rest/git-lfs/storage/~USERNAME/repo_git_lfs/8e3b6c193f91dc94b2f3b0261e3eabbdc604f78ff99fdad324a56fdd0b5e958c?response-content-disposition=attachment%3B%20filename%3D%22ecdsa-0.11.tar.gz%22%3B%20filename*%3Dutf-8%27%27ecdsa-0.11.tar.gz

So it would be good if I could print the URL that the script attempts to get, to then find out what I need to set in .lfsconfig. Currently I have set it to:

[lfs]
    url = https://git.example.loc:3333/rest/git-lfs/storage/~USERNAME/repo_git_lfs/

Note: the git repo address contains lower case: "username" whereas the url for LFS has capital letters: "USERNAME" - the "standard" git lfs client resolves this url correctly. I don't know anything about the internals, so I don't know how this is done correctly.

Changaco commented 7 years ago

The script is showing you the URL it's attempting to get: https://git.example.loc:3333/rest/git-lfs/storage/~USERNAME/repo_git_lfs.git/info/lfs.
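Note that the printed URL is the LFS endpoint, not a file download URL. Under the Git LFS Batch API, a client POSTs a JSON body to `<endpoint>/objects/batch` and the server answers with the per-object download links. A Python 3 sketch that builds such a request without sending it, so the exact target can be printed; `build_batch_request` is a hypothetical name, and whether git_lfs constructs its request exactly like this is an assumption.

```python
import json
from urllib.request import Request


def build_batch_request(lfs_url, oids_and_sizes):
    """Build (without sending) a Git LFS batch download request."""
    body = {
        "operation": "download",
        "transfers": ["basic"],
        "objects": [{"oid": oid, "size": size} for oid, size in oids_and_sizes],
    }
    return Request(
        lfs_url.rstrip("/") + "/objects/batch",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Accept": "application/vnd.git-lfs+json",
            "Content-Type": "application/vnd.git-lfs+json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_batch_request(
        "https://git.example.loc/~username/repo_git_lfs.git/info/lfs",
        # The size is a placeholder; the real value comes from the LFS pointer file.
        [("8e3b6c193f91dc94b2f3b0261e3eabbdc604f78ff99fdad324a56fdd0b5e958c", 0)],
    )
    print(req.get_method(), req.full_url)
```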

Changaco commented 7 years ago

You can find the right URL through the official client: git lfs env.

volviq commented 7 years ago

git lfs env gives me: Endpoint=https://git.example.loc/~username/repo_git_lfs.git/info/lfs (auth=none)

I assume that's what I should look for? I have now set my .lfsconfig to:

    url = https://git.example.loc/~username/repo_git_lfs.git/info/lfs

and I get the same error as initially:

python -m git_lfs -vv
Fetching URLs from https://git.example.loc/~username/main_git_lfs.git/info/lfs...
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/lib/python2.7/dist-packages/git_lfs/__main__.py", line 14, in <module>
    fetch(args.git_repo, args.checkout_dir, args.verbose)
  File "/usr/local/lib/python2.7/dist-packages/git_lfs/__init__.py", line 164, in fetch
    objects = fetch_urls(lfs_url, oid_list)
  File "/usr/local/lib/python2.7/dist-packages/git_lfs/__init__.py", line 103, in fetch_urls
    resp = json.loads(urlopen(req).read().decode('ascii'))
  File "/usr/lib/python2.7/urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 404, in open
    response = self._open(req, data)
  File "/usr/lib/python2.7/urllib2.py", line 422, in _open
    '_open', req)
  File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 1222, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "/usr/lib/python2.7/urllib2.py", line 1184, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [Errno 111] Connection refused>

I would really like to isolate the problem more or add more debug output, but I don't know how to get more out of this.

If what I have set in .lfsconfig is correct, then the identical fetch address (https://git.example.loc/~username/repo_git_lfs.git/info/lfs...) shows that git_lfs picked the right URL automatically in the first place. The URL I tried before (https://git.example.loc:3333/rest/git-lfs/storage/~USERNAME/repo_git_lfs/) was based on Bitbucket's web interface, where I could download the files directly.

Changaco commented 7 years ago

What does curl -v 'https://git.example.loc/~username/repo_git_lfs.git/info/lfs' give you? Does opening that same URL in a web browser work?

volviq commented 7 years ago

curl -v 'https://git.example.loc/~username/repo_git_lfs.git/info/lfs'
*   Trying 172.28.8.123...
* connect to 172.28.8.123 port 443 failed: Connection refused
* Failed to connect to git.ssdis.loc port 443: Connection refused
* Closing connection 0
curl: (7) Failed to connect to git.ssdis.loc port 443: Connection refused

--> manually adding the port changes this a little:

curl -v 'https://git.example.loc:3333/~username/repo_git_lfs.git/info/lfs'
*   Trying 172.28.8.123...
* Connected to git.ssdis.loc (172.28.8.123) port 7999 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* NSS error -5938 (PR_END_OF_FILE_ERROR)
* Encountered end of file
* Closing connection 0
curl: (35) Encountered end of file

With chrome I get:

This site can’t provide a secure connection

git.example.loc sent an invalid response.
ERR_SSL_PROTOCOL_ERROR

Firefox gives me:

Secure Connection Failed

The connection to git.example.loc:3333 was interrupted while the page was loading.

    The page you are trying to view cannot be shown because the authenticity of the received data could not be verified.
    Please contact the web site owners to inform them of this problem.

Lynx gives me:

Alert!: Unable to make secure connection to remote host.
lynx: Can't access startfile 

Interestingly, Bitbucket's web interface is on a different port than the git URL, and the Bitbucket URL works fine (apart from the need to ignore the SSL certificate warnings).

And - using the normal git lfs client just works...
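The two curl runs above appear to fail at different layers: on port 443 the TCP connection itself is refused, while on the other port TCP connects but the TLS handshake ends early. A small Python 3 sketch to tell these cases apart for any host and port; `probe` is a hypothetical helper, and certificate checks are deliberately disabled so that an unknown certificate alone does not count as a failure.

```python
import socket
import ssl


def probe(host, port, timeout=5.0):
    """Classify a connection attempt as 'refused', 'tls-failed', or 'ok'."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "refused"  # nothing is listening on that port
    with sock:
        ctx = ssl.create_default_context()
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
        try:
            with ctx.wrap_socket(sock, server_hostname=host):
                return "ok"  # TCP connect and TLS handshake both succeeded
        except OSError:
            # ssl.SSLError and socket timeouts are both OSError subclasses:
            # the port is open but did not complete a TLS handshake.
            return "tls-failed"


if __name__ == "__main__":
    for port in (443, 3333):
        print(port, probe("git.example.loc", port))
```

A "tls-failed" result on a port would suggest that it serves something other than HTTPS, for example plain HTTP or SSH.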

Changaco commented 7 years ago

Is your repository private? The output of git lfs env you've posted suggests that it isn't (auth=none). The git_lfs python module only supports fetching public files.

volviq commented 7 years ago

The repository is not private - it is public:

Public access
Allow users without a Bitbucket account to clone and browse this repository.

-> is enabled in the Bitbucket UI. Support for private repositories may become a requirement in the long term, I believe, but I'd rather solve the current problem first...

I'll prepare an output of GIT_TRACE=true git lfs clone, maybe that helps?

Changaco commented 7 years ago

Since the repo is public can you give me its real URL so I can try things myself?

volviq commented 7 years ago

> Since the repo is public can you give me its real URL so I can try things myself?

Sorry, it is only public on an internal network. I'm happy to run more debugging.