hbz / lobid

Linking Open Bibliographic Data
https://lobid.org/
Eclipse Public License 2.0

Fix bulk deployment issues #335

Open fsteeg opened 7 years ago

fsteeg commented 7 years ago

Full bulk downloads currently only work on staging, which is deployed on weywot3. We should update and clean up quaoar1 (would depend on https://github.com/hbz/mabxml-elasticsearch/issues/34), or deploy lobid-resources elsewhere.

fsteeg commented 7 years ago

Stopped Elasticsearch on quaoar1 (see https://github.com/hbz/mabxml-elasticsearch/issues/34), but only 4 GB were still reported as free (and that was all the Play app used, even though it was allowed to use more via Xmx).

Saw that a lot of memory was listed as 'Cached' in the output of watch -n 1 cat /proc/meminfo, and cleared that memory with sudo sysctl -w vm.drop_caches=3. After that, about 70 GB were reported as free.
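To illustrate why 'free' can look low while most of the memory is in fact reclaimable, here is a minimal Python sketch that parses /proc/meminfo-style text and adds up MemFree, Buffers and Cached. The function names and the sample values are made up for illustration (they are not real output from quaoar1), and on newer kernels the MemAvailable field is probably the better estimate.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value in kB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not 'Name: value' pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def reclaimable_gb(fields):
    """Rough estimate: free memory plus page cache and buffers, in GB."""
    kb = fields.get("MemFree", 0) + fields.get("Cached", 0) + fields.get("Buffers", 0)
    return kb / (1024 * 1024)


# Illustrative sample only, NOT real output from quaoar1.
SAMPLE = """MemTotal:       98304000 kB
MemFree:         4194304 kB
Buffers:         1048576 kB
Cached:         68157440 kB
"""

info = parse_meminfo(SAMPLE)
print(round(reclaimable_gb(info), 1))
```

On a real host, the same functions could be fed with the contents of /proc/meminfo, e.g. parse_meminfo(open("/proc/meminfo").read()).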

I've allowed the lobid-resources Play apps to use up to 32 GB via the monit config (/etc/monit/conf.d/play-instances.rc). With this, realistic use cases work fine:

curl --header "Accept-Encoding: gzip" "http://lobid.org/resources/search?q=*&owner=DE-290&format=bulk" > Bestand_UB_Dortmund.gz (~5 min)

curl --header "Accept-Encoding: gzip" "http://lobid.org/resources/search?q=*&owner=DE-38&format=bulk" > Bestand_USB_Koeln.gz (~12 min)
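For scripted use, the same requests could be made from Python. This is only a sketch using the standard library: build_bulk_url and download_bulk are hypothetical helper names, and the endpoint and query parameters are simply taken from the curl calls above. Like curl without --compressed, urllib does not decompress the response, so the file on disk stays gzip-compressed.

```python
import shutil
import urllib.parse
import urllib.request

# Endpoint as used in the curl calls above.
BASE_URL = "http://lobid.org/resources/search"


def build_bulk_url(owner, base_url=BASE_URL):
    """Build the bulk download URL for one owner (hypothetical helper)."""
    query = urllib.parse.urlencode({"q": "*", "owner": owner, "format": "bulk"})
    return f"{base_url}?{query}"


def download_bulk(owner, target_path, chunk_size=1024 * 1024):
    """Stream the gzip-compressed response to disk in fixed-size chunks."""
    request = urllib.request.Request(
        build_bulk_url(owner), headers={"Accept-Encoding": "gzip"}
    )
    with urllib.request.urlopen(request) as response, open(target_path, "wb") as out:
        # Copy chunk by chunk so the client never holds the whole file in memory.
        shutil.copyfileobj(response, out, chunk_size)


# Example call (needs network access, so it is left commented out):
# download_bulk("DE-290", "Bestand_UB_Dortmund.gz")
```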

This should work fine for the launch (I also switched the staging deployment, which in the meantime was running on weywot3, back to quaoar1 in the vhost config on emphytos). However, the underlying issue remains unsolved. As described and reproduced in https://github.com/hbz/lobid-resources/pull/372, when running locally and curled via the network, the implementation allows bulk downloads of the entire uncompressed data (q=*, no owner, no Accept-Encoding) without changing the default (1 GB) memory config.

We should continue working on the deployment setup when we get additional RAM for the weywots and can move the Play app deployments from quaoar1 to the weywots (we should start on weywot3, which has the most current OS, Java, etc.).

So, on successful functional review, I suggest we remove launch and move this to the backlog.

acka47 commented 7 years ago

+1

ChristophEwertowski commented 7 years ago

+1

fsteeg commented 7 years ago

Data for full bulk seems to be incomplete, as reported by @hagbeck via mail:

* just under 60 minutes, but only 4.13M records, no error message
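To check a downloaded file for completeness, something like the following Python sketch might help. It assumes the bulk format contains one record per line (an assumption, not confirmed in this thread), and count_records is a hypothetical helper name. The demo data is made up. A gzip stream that was cut off raises EOFError on reading, which the sketch reports as an incomplete file.

```python
import gzip
import os
import tempfile


def count_records(path):
    """Return (record_count, complete) for a gzip-compressed bulk file.

    Assumes one record per line. 'complete' is False if the gzip stream
    ends before its end-of-stream marker, e.g. after an aborted download.
    """
    count = 0
    try:
        with gzip.open(path, "rb") as f:
            for line in f:
                if line.strip():
                    count += 1
    except EOFError:
        return count, False
    return count, True


# Small self-contained demo with made-up records.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.gz")
    with gzip.open(path, "wb") as f:
        f.write(b'{"id": 1}\n{"id": 2}\n{"id": 3}\n')
    full_result = count_records(path)

    # Simulate an interrupted download by cutting off the end of the file.
    with open(path, "rb") as f:
        data = f.read()
    with open(path, "wb") as f:
        f.write(data[:-4])
    truncated_result = count_records(path)

print(full_result, truncated_result)
```

If a file decompresses cleanly but still has too few records, the truncation likely happened on the server side before compression finished normally, which would match a download that ends without an error message.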