EOL / tramea

A lightweight server for denormalized EOL data

opendata downloads are timing out? #366

Closed jhammock closed 7 years ago

jhammock commented 7 years ago

It appears that downloads from opendata (on the Archive machine) are consistently timing out before files can be downloaded. @JRice, @eliagbayani, is this an issue for SI sysadmin?

From @jhpoelen:

To reduce the load that resolving taxon names, common names and key images puts on the EOL API, the EOL team is working on providing a taxon hierarchy archive.

Today, I unsuccessfully attempted to download EOL taxon hierarchies from http://opendata.eol.org using wget:

wget http://opendata.eol.org/hierarchy_entries.tgz
--2017-01-12 15:59:55--  http://opendata.eol.org/hierarchy_entries.tgz
Resolving opendata.eol.org (opendata.eol.org)... 160.111.248.28
Connecting to opendata.eol.org (opendata.eol.org)|160.111.248.28|:80... connected.
HTTP request sent, awaiting response... 503 Service Unavailable
2017-01-12 15:59:57 ERROR 503: Service Unavailable.

JRice commented 7 years ago

Yeesh. It's not a machine that I set up (Dima did it), but I can say that it's a pretty simple affair, running nginx (only).

Is there a simple nginx setting that will increase the timeout? Apologies, nginx isn't my area of expertise. (It should be, but I've been too lazy to learn it, sigh.)

jhammock commented 7 years ago

So it's not on the Archive machine? Would that help? It was the sort of thing we got Archive for, as I understand it.

JRice commented 7 years ago

Nope, it has nothing to do with the host; everything to do with the (docker) container configuration. Sorry. :S

eliagbayani commented 7 years ago

My two cents: I have already downloaded hierarchy_entries.tgz from opendata.eol.org on two occasions, first the version without parent_id and then the version with parent_id.

But yes, opendata.eol.org is offline now.

jhammock commented 7 years ago

I see opendata is back up. @jhpoelen, you might want to try it again, I suppose. I'm not sure whether the errors you ran into before it went down were related to the problems @JRice mentions.

jhpoelen commented 7 years ago

@jhammock thanks for letting me know.

I am getting a 503 (Service Unavailable) when attempting to retrieve the archive file. Please note that the timestamp is in Pacific time.

wget http://opendata.eol.org/hierarchy_entries.tgz
--2017-01-17 19:13:07--  http://opendata.eol.org/hierarchy_entries.tgz
Resolving opendata.eol.org (opendata.eol.org)... 160.111.248.28
Connecting to opendata.eol.org (opendata.eol.org)|160.111.248.28|:80... connected.
HTTP request sent, awaiting response... 503 Service Unavailable
2017-01-17 19:13:12 ERROR 503: Service Unavailable.

jhpoelen commented 7 years ago

Note that clicking on the file through a web browser does seem to start a download. However, this method is not quite suitable for the kind of automation that I had in mind.

Here's another response, this time using curl:

curl  http://opendata.eol.org/hierarchy_entries.tgz

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
  <head>
    <title>503 Service Unavailable</title>
  </head>
  <body>
    <h1>Error 503 Service Unavailable</h1>
    <p>Service Unavailable</p>
    <h3>Guru Meditation:</h3>
    <p>XID: 1730645211</p>
    <hr>
    <p>Varnish cache server</p>
  </body>
</html>
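
For the kind of automated download discussed in this thread, one option is to retry the request while the cache server keeps answering 503. The following is a minimal sketch, assuming a POSIX shell with curl installed. The helper names `retry` and `fetch_archive` are hypothetical (they are not part of any EOL tooling), and the attempt count and delay in the usage line are arbitrary choices, not recommended values.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# retry ATTEMPTS DELAY CMD [ARGS...]
# Run CMD until it exits 0, at most ATTEMPTS times, sleeping DELAY
# seconds between failed attempts. Returns 1 if every attempt fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $i of $attempts failed" >&2
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# fetch_archive URL DEST
# --fail makes curl exit non-zero on HTTP errors such as 503, so the
# HTML error page is not saved as if it were the archive.
fetch_archive() {
    curl --fail --silent --show-error --location --output "$2" "$1"
}
```

With these definitions loaded, `retry 5 30 fetch_archive http://opendata.eol.org/hierarchy_entries.tgz hierarchy_entries.tgz` would try the download up to five times, 30 seconds apart. Retrying only helps if the 503 is intermittent; it does not address whatever is failing behind the cache.
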

jhpoelen commented 7 years ago

Also, see https://github.com/jhpoelen/eol-globi-data/issues/274#issuecomment-274671229 .

jhpoelen commented 7 years ago

(from https://github.com/jhpoelen/eol-globi-data/issues/274): I created a data publication with the file at Zenodo: DOI so that we can all enjoy uninterrupted, citable downloads. As far as I am concerned, we can close this issue, because a viable workaround exists.