CenterForOpenScience / waterbutler

WaterButler is a Python web application for interacting with various file storage services via a single RESTful API, developed at Center for Open Science.
Apache License 2.0

504 Gateway Time-out when using osf-client and google drive #284

Closed (luizirber closed 6 years ago)

luizirber commented 7 years ago

Using the current master for osf-client:

pip install https://github.com/dib-lab/osf-cli/archive/cf525b42de3ef8ddef8cb4a4d4f5c826160f7a35.zip

When I try to list the contents of project vk4fa:

$ osf -p vk4fa list
googledrive/kraken_db/taxonomy/taxdump.tar.gz
googledrive/kraken_db/taxonomy/taxdump.flag
googledrive/kraken_db/taxonomy/taxdump.dlflag
googledrive/kraken_db/taxonomy/readme.txt
googledrive/kraken_db/taxonomy/nodes.dmp
googledrive/kraken_db/taxonomy/names.dmp
googledrive/kraken_db/taxonomy/merged.dmp
googledrive/kraken_db/taxonomy/gi_taxid_nucl.dmp
googledrive/kraken_db/taxonomy/gimap.flag
googledrive/kraken_db/taxonomy/gimap.dlflag
googledrive/kraken_db/taxonomy/gencode.dmp
googledrive/kraken_db/taxonomy/gc.prt
googledrive/kraken_db/taxonomy/division.dmp
googledrive/kraken_db/taxonomy/delnodes.dmp
googledrive/kraken_db/taxonomy/citations.dmp
googledrive/kraken_db/seqid2taxid.map
Traceback (most recent call last):
 File "bin/osf", line 11, in <module>
   load_entry_point('osfclient==0.0.3', 'console_scripts', 'osf')()
 File "lib/python3.6/site-packages/osfclient/__main__.py", line 104, in main
   exit_code = args.func(args)
 File "lib/python3.6/site-packages/osfclient/cli.py", line 91, in wrapper
   return_value = f(cli_args)
 File "lib/python3.6/site-packages/osfclient/cli.py", line 222, in list_
   for file_ in store.files:
 File "lib/python3.6/site-packages/osfclient/models/file.py", line 111, in _iter_children
   children.extend(self._follow_next(url))
 File "lib/python3.6/site-packages/osfclient/models/core.py", line 64, in _follow_next
   response = self._json(self._get(url), 200)
 File "lib/python3.6/site-packages/osfclient/models/core.py", line 60, in _json
   status_code))
RuntimeError: Response has status code 504 not (200,)

Running under pdb and inspecting the Response object, the failing URL is https://api.osf.io/v2/nodes/vk4fa/files/googledrive/kraken_db/library/Viruses/. The error can be reproduced with cURL at this point:

$ curl https://api.osf.io/v2/nodes/vk4fa/files/googledrive/kraken_db/library/Viruses/ 
<html>
<head><title>504 Gateway Time-out</title></head>
<body bgcolor="white">
<center><h1>504 Gateway Time-out</h1></center>
<hr><center>nginx</center>
</body>
</html>
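
For anyone who wants to run the same check from Python rather than cURL, here is a minimal sketch using only the standard library. The helper name `probe_status`, its `timeout` default, and the injectable `opener` parameter are assumptions made for illustration; none of them come from osfclient or WaterButler.

```python
import urllib.error
import urllib.request


def probe_status(url, timeout=120, opener=urllib.request.urlopen):
    """Return the HTTP status code of a GET on `url`.

    `opener` is a hypothetical hook so the function can be exercised
    without network access; by default it is urllib's own urlopen.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is still
        # available on the exception, and that is what we want to report.
        return err.code


# Example call (performs a real request, so it is left commented out):
# probe_status("https://api.osf.io/v2/nodes/vk4fa/files/googledrive/kraken_db/library/Viruses/")
```

A 504 returned here would match the cURL result above and point at the API side rather than at osf-client itself.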

If I go to https://osf.io/vk4fa/files/ and browse to the same path (googledrive/kraken_db/library/Viruses/), the contents of the folder load without problems.

From this point on I don't know how to debug further: why does it work in the web frontend but fail when I query the API directly?
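
As a possible client-side stopgap while the server-side cause is unknown, a listing loop could retry a page that comes back as 504 instead of aborting. The sketch below is not how osfclient's `_follow_next` is implemented. The `get_json` callable, the retry parameters, and the assumption that each page carries its items under `data` and the next page URL under `links.next` are all illustrative.

```python
import time


def follow_next(url, get_json, retries=3, backoff=2.0, sleep=time.sleep):
    """Collect items across paginated responses, retrying pages that time out.

    `get_json(url)` is a hypothetical callable returning a
    (status_code, parsed_json) tuple. Each page is assumed to look like
    {"data": [...], "links": {"next": <url or None>}}.
    """
    items = []
    while url is not None:
        for attempt in range(retries + 1):
            status, body = get_json(url)
            if status == 200:
                break
            # Only a gateway timeout is retried; anything else fails at once,
            # and so does a 504 once the retry budget is used up.
            if status != 504 or attempt == retries:
                raise RuntimeError(
                    "Response has status code {} not (200,) for {}".format(status, url)
                )
            # Exponential backoff: backoff, 2*backoff, 4*backoff, ...
            sleep(backoff * (2 ** attempt))
        items.extend(body["data"])
        url = body["links"].get("next")
    return items
```

Retrying only helps if the timeout is intermittent. If the folder listing is consistently too slow for the gateway, every attempt will fail and the fix has to happen on the server.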

brianjgeiger commented 7 years ago

@luizirber Could you try connecting that to a project on test.osf.io to see if the problem replicates? That should let us know if it's something in the database or something with that directory.

sloria commented 7 years ago

@luizirber We've just deployed a fix that should prevent that timeout. Would you mind trying again?

felliott commented 6 years ago

@luizirber, did that fix work for you?

felliott commented 6 years ago

Closing as part of New Year's tidiness party! Feel free to reopen if the original issue has not been fixed.