LibreCat / Catmandu

Catmandu - a data processing toolkit
https://librecat.org

Catmandu and Blacklight image #243

Closed: panblla closed this issue 3 years ago

panblla commented 8 years ago

It would be nice if, alongside the VM you already provide, there were another VM with Blacklight installed, much like the current VM that comes with MongoDB and Elasticsearch pre-installed.

It should be a clean Blacklight install. It would then be nice to share a fix that lets us load an .mrc file (bib or auth) into Blacklight using only the defaults.

So the scenario I suggest is this:

Alongside your very useful Catmandu OVA, you add a new OVA with a Catmandu + Blacklight installation, plus some documentation on how to load MARC bib and auth files into it.

This way, the barrier to entry would be much lower. Or something like this approach: https://github.com/LibreCat/docker-catmandu

netsensei commented 8 years ago

I installed Blacklight last month on a VirtualBox/Vagrant setup. For reference, here are my own installation notes:

# Installing Rails

# See: https://www.digitalocean.com/community/tutorials/how-to-install-ruby-on-rails-with-rbenv-on-ubuntu-14-04

sudo apt-get update

sudo apt-get install git-core curl zlib1g-dev build-essential libssl-dev libreadline-dev libyaml-dev libsqlite3-dev sqlite3 libxml2-dev libxslt1-dev libcurl4-openssl-dev python-software-properties libffi-dev

cd
git clone http://github.com/sstephenson/rbenv.git .rbenv
echo 'export PATH="$HOME/.rbenv/bin:$PATH"' >> ~/.bashrc
echo 'eval "$(rbenv init -)"' >> ~/.bashrc

git clone http://github.com/sstephenson/ruby-build.git ~/.rbenv/plugins/ruby-build
echo 'export PATH="$HOME/.rbenv/plugins/ruby-build/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc

rbenv install -v 2.2.3
rbenv global 2.2.3

echo "gem: --no-document" > ~/.gemrc

gem install bundler
gem install rails

rbenv rehash

# Installing NodeJS

sudo add-apt-repository ppa:chris-lea/node.js
sudo apt-get update

sudo apt-get install nodejs

# Installing Java
# See: https://www.digitalocean.com/community/tutorials/how-to-install-java-on-ubuntu-with-apt-get

sudo apt-get install python-software-properties
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update

sudo apt-get install oracle-java8-installer

# SOLR
# See: https://www.digitalocean.com/community/tutorials/how-to-install-solr-5-2-1-on-ubuntu-14-04

wget http://apache.mirror1.spango.com/lucene/solr/5.2.1/solr-5.2.1.tgz
tar xzf solr-5.2.1.tgz solr-5.2.1/bin/install_solr_service.sh --strip-components=2
sudo bash ./install_solr_service.sh solr-5.2.1.tgz
sudo service solr status

Then continue with the Quickstart guide: https://github.com/projectblacklight/blacklight/wiki/Quickstart

=> Follow "installing the hard way" to get everything set up correctly. I ran into trouble following the "generator / easy way".

I think it should be fairly trivial to transpose these instructions to an Ansible playbook or a Puppet manifest for easy setup with Vagrant + provisioning. (I don't use Docker personally since I'm on OS X.)
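As a stepping stone before a full playbook, the notes above could be wrapped in a plain shell provisioner. What follows is only a rough sketch: the script name `provision.sh`, the helper names `run`, `missing` and `ensure_pkg`, and the `DRY_RUN` switch are all made up for illustration, and only a few of the packages are shown. It prints what it would do by default, so it is safe to try outside a VM.

```sh
#!/bin/sh
# provision.sh: hypothetical shell provisioner wrapping part of the notes above.
# DRY_RUN=1 (the default here) only prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Succeeds when the given command is not yet on the PATH.
missing() {
    ! command -v "$1" >/dev/null 2>&1
}

# Install a package only when its command is absent,
# so that re-running the provisioner stays cheap.
ensure_pkg() {
    cmd="$1"
    pkg="$2"
    if missing "$cmd"; then
        run sudo apt-get install -y "$pkg"
    else
        echo "skip $pkg ($cmd already present)"
    fi
}

run sudo apt-get update
ensure_pkg git git-core
ensure_pkg curl curl
ensure_pkg sqlite3 sqlite3
```

With Vagrant, a script like this would typically be hooked in through the shell provisioner (`config.vm.provision "shell", path: "provision.sh"` in the Vagrantfile) and run with `DRY_RUN=0`. The rbenv, Java and Solr steps would need the same treatment and are left out here.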

netsensei commented 8 years ago

Installing catmandu + blacklight on the same VM box => interesting question.

I think there are only limited use cases where you want to do that. I'm thinking of these conditions:

1/ You don't want to open up the Solr index to secure input from the outside world (catmandu processing on another location)

2/ You don't want to process data on an intermediary box (your own laptop, dedicated VM box,...) before pushing to blacklight.

One use case would be: catmandu fetches data from an outside source i.e. an API, processes it and passes it on to the solr index within the box itself. So, data is pulled to the cat+black box instead of pushed by an action outside the box.
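A rough sketch of that pull scenario, meant to run inside the box (from cron, for instance). Everything specific here is an assumption: the source URL, the core name `blacklight-core`, the Solr field names `title_display` and `author_display`, and the `RUN` switch are placeholders, and the Catmandu::MARC and Catmandu::Solr modules are assumed to be installed. How the record identifier maps onto the schema's unique key depends on the Solr store configuration and is not shown. By default the script only writes the fix file and reports what it would do.

```sh
#!/bin/sh
# pull_and_index.sh: sketch of the "pull" use case, run inside the box.
# SOURCE_URL, SOLR_URL and the Solr field names are placeholders.
SOURCE_URL="${SOURCE_URL:-https://example.org/export/records.mrc}"
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr/blacklight-core}"
WORKDIR="${WORKDIR:-${TMPDIR:-/tmp}/catmandu-pull}"

mkdir -p "$WORKDIR"

# Minimal fix file: copy a few MARC subfields into flat fields,
# then drop the raw MARC structure, which Solr cannot store as-is.
cat > "$WORKDIR/marc2solr.fix" <<'EOF'
marc_map('245a', 'title_display')
marc_map('100a', 'author_display')
remove_field('record')
EOF

# Download the MARC file from the outside source.
fetch() {
    curl -fsSL -o "$WORKDIR/records.mrc" "$SOURCE_URL"
}

# Convert and index into the local Solr in one Catmandu call.
index() {
    catmandu import MARC --type ISO --fix "$WORKDIR/marc2solr.fix" \
        to Solr --url "$SOLR_URL" < "$WORKDIR/records.mrc"
}

# Set RUN=1 to execute; otherwise only report what would happen.
if [ "${RUN:-0}" = "1" ]; then
    fetch && index
else
    echo "dry run: would fetch $SOURCE_URL and index into $SOLR_URL"
fi
```

Because Solr is addressed on localhost, nothing has to be exposed outside the box in this setup.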

But from a devops perspective, you might want to keep catmandu separate from your blacklight. Why? Because running a Perl stack and a Ruby on Rails stack on the same machine could become a maintenance nightmare (thinking of potential version conflicts with shared dependency libraries and the like).

Also, it's not that hard to secure a Solr endpoint and, depending on your needs, "offshore" the catmandu processing to your laptop (if you only infrequently process data) or to a dedicated VM (if you process a significant amount of data each night).
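One possible way to do this (just one option among several) is to keep Solr listening on localhost only and reach it from the laptop through an SSH tunnel. The sketch below assumes a hypothetical host `blacklight.example.org`, a core called `blacklight-core`, and a fix file and MARC file of your own named `marc2solr.fix` and `records.mrc`; the `RUN` switch is also made up, and without it the script only prints what it would do.

```sh
#!/bin/sh
# tunnel_index.sh: sketch of pushing from a laptop to a Solr bound to localhost.
# REMOTE, LOCAL_PORT and CORE are placeholders.
REMOTE="${REMOTE:-deploy@blacklight.example.org}"
LOCAL_PORT="${LOCAL_PORT:-8983}"
CORE="${CORE:-blacklight-core}"

# URL that Catmandu talks to: the laptop end of the tunnel.
solr_url() {
    echo "http://localhost:${LOCAL_PORT}/solr/${CORE}"
}

# Forward a local port to port 8983 on the remote box.
# -N: no remote command, -f: go to the background after authentication.
open_tunnel() {
    ssh -f -N -L "${LOCAL_PORT}:localhost:8983" "$REMOTE"
}

# Set RUN=1 to execute; otherwise only report what would happen.
if [ "${RUN:-0}" = "1" ]; then
    open_tunnel &&
        catmandu import MARC --type ISO --fix marc2solr.fix \
            to Solr --url "$(solr_url)" < records.mrc
else
    echo "dry run: would tunnel to $REMOTE and index into $(solr_url)"
fi
```

The backgrounded tunnel stays open after the import finishes, so it has to be closed by hand (or replaced by a foreground `ssh` in a second terminal).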

phochste commented 8 years ago

Thanks for the tricks! Yes, at Ghent University we also keep things on separate machines: Blacklight, Solr and Catmandu ETL. But I think the question is more about demonstration purposes. The VM that is created now is just to demo stuff. The main concern is keeping the image small so that it can easily be downloaded at workshops or put on (possibly) cheap USB sticks.

nichtich commented 3 years ago

Can this issue be closed?