fititnt / uwazi-docker

Dockerized version of Uwazi (“openness” in Swahili). HURIDOCS designed Uwazi to make human rights information more open and accessible to the defenders who need it.
The Unlicense

Document how to use Elastic Search & MongoDB containers from uwazi-docker with non-containerized uwazi #12

Closed fititnt closed 1 year ago

fititnt commented 6 years ago

The steps to run Uwazi from the host (e.g. a developer machine) with dockerized Elastic Search + MongoDB were deleted in recent commits.

Since Docker works even better for local development, let's document it again, either here or in the huridocs repository.

fititnt commented 6 years ago

fork-uwazi

fititnt commented 6 years ago

# fititnt at bravo in /alligo/code/fititnt/uwazi on git:development o [17:09:31]
$ yarn hot
yarn run v1.5.1
$ export HOT=true;npm run dev-server & npm run webpack-server

> Uwazi@0.0.1 dev-server /alligo/code/fititnt/uwazi
> nodemon --ignore 'app/dist/*' --watch 'app/api' --watch 'app/shared' --watch 'app/react/ServerRouter.js'

> Uwazi@0.0.1 webpack-server /alligo/code/fititnt/uwazi
> node ./webpack/webpack.server.js

module.js:540
    throw err;
    ^

Error: Cannot find module 'webpack'
    at Function.Module._resolveFilename (module.js:538:15)
    at Function.Module._load (module.js:468:25)
    at Module.require (module.js:587:17)
    at require (internal/module.js:11:18)
    at Object.<anonymous> (/alligo/code/fititnt/uwazi/webpack/webpack.server.js:1:79)
    at Module._compile (module.js:643:30)
    at Object.Module._extensions..js (module.js:654:10)
    at Module.load (module.js:556:32)
    at tryModuleLoad (module.js:499:12)
    at Function.Module._load (module.js:491:3)
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! Uwazi@0.0.1 webpack-server: `node ./webpack/webpack.server.js`
npm ERR! Exit status 1
npm ERR! 
npm ERR! Failed at the Uwazi@0.0.1 webpack-server script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.
npm WARN Local package.json exists, but node_modules missing, did you mean to install?

npm ERR! A complete log of this run can be found in:
npm ERR!     /home/fititnt/.npm/_logs/2018-04-21T20_13_26_237Z-debug.log
error An unexpected error occurred: "Command failed.
Exit code: 1
Command: sh
Arguments: -c export HOT=true;npm run dev-server & npm run webpack-server
Directory: /alligo/code/fititnt/uwazi
Output:
".
info If you think this is a bug, please open a bug report with the information provided in "/alligo/code/fititnt/uwazi/yarn-error.log".
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.

# fititnt at bravo in /alligo/code/fititnt/uwazi on git:development o [17:13:27]
$ [nodemon] 1.17.3
[nodemon] to restart at any time, enter `rs`
[nodemon] watching: /alligo/code/fititnt/uwazi/app/api/**/* /alligo/code/fititnt/uwazi/app/shared/**/* app/react/ServerRouter.js
[nodemon] starting `node server.js`
module.js:540
    throw err;
    ^

Error: Cannot find module 'es6-promise'
    at Function.Module._resolveFilename (module.js:538:15)
    at Function.Module._load (module.js:468:25)
    at Module.require (module.js:587:17)
    at require (internal/module.js:11:18)
    at Object.<anonymous> (/alligo/code/fititnt/uwazi/server.js:2:1)
    at Module._compile (module.js:643:30)
    at Object.Module._extensions..js (module.js:654:10)
    at Module.load (module.js:556:32)
    at tryModuleLoad (module.js:499:12)
    at Function.Module._load (module.js:491:3)
[nodemon] app crashed - waiting for file changes before starting...
# fititnt at bravo in /alligo/code/fititnt/uwazi on git:development o [17:18:24]
$ yarn production-build   
yarn run v1.5.1
$ NODE_ENV=production webpack --config ./webpack.production.config.js --progress --profile --colors
sh: 1: webpack: not found
error An unexpected error occurred: "Command failed.
Exit code: 127
Command: sh
Arguments: -c NODE_ENV=production webpack --config ./webpack.production.config.js --progress --profile --colors
Directory: /alligo/code/fititnt/uwazi
Output:
".
info If you think this is a bug, please open a bug report with the information provided in "/alligo/code/fititnt/uwazi/yarn-error.log".
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.

fititnt commented 6 years ago

yarn hot uwazi welcome screen

fititnt commented 6 years ago

TODO: report that yarn install step must be documented

konzz commented 6 years ago

@fititnt I think the guide should include yarn blank-state right after this step:

# Install node modules used by Uwazi.
yarn install

What do you think? Also, this guide is better than ours 😅
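
A small guard for the missing-install case seen in the log above could be sketched as follows. It assumes a POSIX shell, and the function name needs_install is hypothetical and not part of Uwazi. The yarn commands in the trailing comment are the ones mentioned in this thread.

```shell
#!/bin/sh
# needs_install DIR
# Hypothetical helper (not part of Uwazi): exits 0 when DIR contains a
# package.json but no node_modules directory, i.e. dependencies were
# never installed; exits non-zero otherwise.
needs_install() {
  [ -f "$1/package.json" ] && [ ! -d "$1/node_modules" ]
}

# Possible use from the Uwazi checkout (commands taken from this thread):
#   if needs_install .; then yarn install; fi
#   yarn blank-state
#   yarn hot
```

This mirrors the npm warning "Local package.json exists, but node_modules missing" and only decides whether an install is needed. It does not run anything by itself.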

fititnt commented 6 years ago

Good catch! At some point, when this is more stable, we can merge it into the official docs. The docker & docker-compose configs are in the public domain (or MIT), so if I get busy and can't answer fast, and there are enough people inside Uwazi to keep it updated, huridocs not only could but should copy it, without needing to ask first.

This also means that people can use uwazi-docker as a base to containerize other great humanitarian FOSS out there.

fititnt commented 6 years ago

Also, this tweet can give you and other people some context for what I'm about to say:

https://twitter.com/fititnt/status/987747799495213060

The simple reason is this: machine learning systems have far more documentation and tutorials for tech that could have a bad social impact (think Uberization of jobs, unethical mass surveillance, etc.) than for technology that could mitigate the social issues caused by the exponential power of future intelligent systems.

Maybe I'm going too far in connecting point A to point B, but I think one inexpensive way to empower a lot of HFOSS is to abstract away part of its complexity and give quick starts to people who are not specialists in the software stack but are good at other tech. For example, a lot of machine learning software is written in Python, not in JavaScript (used in this case by Uwazi), so it is harder for expert volunteers to help. Or think of people at universities who could use Uwazi integrated with other systems for published papers. Or people who could promote AI hackathons in the future and have some software with part of the work already done, so that instead of writing hackathon code from scratch that is never used again, we could start getting better contributions.

Using Uwazi as an example, its stack also uses Mongo and Elastic Search, and for outsiders it would be nice if uwazi-docker also gave some suggestions for GUI tools. So making #4 and #5 easier (thinking from the perspective of people with no previous Mongo / Elastic Search expertise) would also improve the quick start for possible researchers in the future (or improve the chances of Uwazi being selected at AI hackathons).

It will take me some time to discover other things that could be useful to add to the docker-compose.yml (for those who don't know, it is a file that lets people run additional software with a single command, e.g. software that could allow people to use Uwazi as part of some experiment platform). In general, people who work with artificial intelligence do their jobs in software like Tensorflow / Theano / Keras, but there are other tools, less powerful but simpler to use, that discover patterns without coding (or, if code is used, platforms that are less complicated) and that could be embedded in "toolkits" like a uwazi-docker draft.

In this case, Uwazi uses 2 databases, Mongo and Elastic Search. Software that does machine learning on these data stores (or maybe just other software that creates nice dashboards based on statistics) could be embedded and used without the need to put anything "on the cloud", because even an old computer with 4 or 8 GB of RAM could do the work locally (given access to the datasets). Since more HFOSS projects use databases like Mongo and Elastic Search, the same knowledge and documentation can serve multiple projects.

This repository has public domain as its default license on purpose. Uwazi is just one of the projects I'm looking at, but I'm doing this with smaller open source projects in mind. Some of these projects embed machine learning tools but do not document them very well, and maybe some quick starts or infrastructure as code to use as inspiration could foster results at scale.

txau commented 6 years ago

@fititnt I'm not sure I'm fully following you, but I hope I can provide some useful information.

We are already working in parallel on machine learning implementations for Uwazi. Most of the research we are doing involves NLToolkit, Fasttext, Word2Vec and Tensorflow. Everything is coded in Python on the server side, and we integrate it with Uwazi via an HTTP API.

There is a branch of Uwazi with the prototype for the front-end: https://github.com/huridocs/uwazi/tree/ML. This is just scaffolding for a future production-ready feature, but so far so good. At the moment we are focusing on text classification, which comes in very handy for tagging large datasets.

Next steps involve feature extraction and entity reconciliation.

As you mentioned in your twitter thread, the idea is to make this technology available for small organizations with limited resources, so they can achieve more with their data.

fititnt commented 1 year ago

The pull request https://github.com/huridocs/uwazi/pull/2559 (@vorburger) and the documentation at https://github.com/huridocs/uwazi#docker already cover this.

It is possible to run Docker both ways, but https://github.com/huridocs/uwazi likely has far better documentation for running only the databases in Docker while the user edits/contributes to huridocs/uwazi.

This could be documented in fititnt/uwazi-docker, but it would mostly mean just exposing the ports, which is not really complex, so I will not do it here.
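
For readers who want a starting point anyway, exposing the ports could look roughly like the sketch below. It is not the configuration from either repository. The helper names (run, start_db), the container names, and the ES_IMAGE variable are hypothetical, and the image versions must match what your Uwazi release expects. The ports are the defaults for MongoDB (27017) and the Elasticsearch HTTP API (9200). The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# run CMD...: print the command when DRY_RUN is 1 (the default),
# execute it when DRY_RUN is set to anything else.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# start_db NAME IMAGE PORT [EXTRA_DOCKER_ARGS...]
# Starts a detached container and publishes PORT on the host's loopback
# interface only, so a non-containerized Uwazi on the same machine can
# connect while the database stays unreachable from the network.
start_db() {
  name=$1 image=$2 port=$3
  shift 3
  run docker run -d --name "$name" -p "127.0.0.1:${port}:${port}" "$@" "$image"
}

# Possible use (names and images are assumptions, adjust to your setup):
#   start_db uwazi-mongo mongo 27017
#   start_db uwazi-elasticsearch "$ES_IMAGE" 9200
# Some Elasticsearch images need extra settings passed with -e; check the
# documentation of the image you use.
```

Binding to 127.0.0.1 is a deliberate choice for a developer machine. Drop the address prefix only if other hosts really need to reach the databases.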

Closing this issue now