
# TextData

TextData is a social platform for collaboratively creating, sharing, and learning from wiki-style communities. We offer a stand-alone website and a Chrome extension, all for free.

To use TextData, you have three options:

  1. Full online version: Visit textdata.org, install the Chrome extension, create an account, and begin using TextData.
  2. Full offline version: Clone this repository, set up Docker, and run the services locally. This is described in the section below titled "Setting Up the Local Version".
  3. Hosted backend, local frontend: You can leverage the APIs for the backend, and create or extend your own frontend (website or browser extension). The API documentation is [here](https://github.com/thecommunitydigitallibrary/cdl-platform/tree/dev/backend).
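If you build your own frontend (option 3), a first decision is which backend base URL to target. A minimal sketch, assuming the hosted site and the local Docker port from this README; the helper name is illustrative, and the hosted API base is an assumption, not a documented endpoint:

```python
# Sketch: selecting a backend base URL for a custom frontend.
# HOSTED_API is an assumed base (the README only says "visit textdata.org");
# LOCAL_API matches the local Docker setup described below.
HOSTED_API = "https://textdata.org"
LOCAL_API = "http://localhost:8080"

def api_base(use_local: bool) -> str:
    """Return the backend base URL for the chosen deployment."""
    return LOCAL_API if use_local else HOSTED_API
```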
## Setting Up the Offline Version

Note that the local version is still under development:

- No data is persisted; once the Docker containers stop, all data is lost.
- "Reset Password" will not work because there is no access to SendGrid.

### Requirements for running locally

- [Docker](https://www.docker.com/) and [Docker Compose](https://docs.docker.com/compose/)
- [Docker Desktop](https://www.docker.com/products/docker-desktop/) for Windows
- [Configuring Docker for OpenSearch](https://opensearch.org/docs/latest/install-and-configure/install-opensearch/docker/). If you are running on Windows (and thus using WSL for Docker), follow [these directions](https://github.com/docker/for-win/issues/5202) to increase `vm.max_map_count`:

  ```
  # open PowerShell
  wsl -d docker-desktop
  sysctl -w vm.max_map_count=262144
  ```

- With all of the Docker containers, packages, and models, the total size is ~10GB. Without Neural, it is ~3GB.

### Configuring the env files

Copy the following to `backend\env_local.ini`:

```
api_url=http://localhost
api_port=8080
webpage_port=8080
jwt_secret=0047fa567bbddf121b23d1deaa7ff2af
redis_host=redis
redis_port=6379
redis_password=admin
cdl_uri=mongodb://mongodb:27017/?retryWrites=true&w=majority
cdl_test_uri=mongodb://localhost:27017
db_name=cdl-local
elastic_username=admin
elastic_password=admin
elastic_index_name=submissions
elastic_webpages_index_name=webpages
elastic_domain=http://host.docker.internal:9200/
elastic_domain_backfill=http://localhost:9200/
```

Copy the following to `frontend\website\.env.local`:

```
NEXT_PUBLIC_FROM_SERVER=http://host.docker.internal:8080/
NEXT_PUBLIC_FROM_CLIENT=http://localhost:8080/
```

Copy the following to `frontend\extension\.env.local`:

```
REACT_APP_URL=http://localhost:8080/
REACT_APP_WEBSITE=http://localhost:8080/
```

### Starting the services

#### Website, backend API, MongoDB, and OpenSearch

Add the following to `docker-compose.yml`:

```yaml
services:
  redis:
    image: redis:alpine
    command: redis-server --requirepass admin
    restart: always
    ports:
      - '6379:6379'
  reverseproxy:
    image: reverseproxy
    build:
      context: ./reverseproxy
      dockerfile: Dockerfile-local
    ports:
      - 8080:8080
    restart: always
  mongodb:
    image: mongo
    ports:
      - 27017:27017
    restart: always
    command: mongod --bind_ip 0.0.0.0
  opensearch-node1:
    image: opensearchproject/opensearch:latest
    container_name: opensearch-node1
    environment:
      - cluster.name=opensearch-cluster
      - node.name=opensearch-node1
      - discovery.seed_hosts=opensearch-node1
      - cluster.initial_cluster_manager_nodes=opensearch-node1
      - bootstrap.memory_lock=true
      - DISABLE_SECURITY_PLUGIN=true
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536 # Maximum number of open files for the opensearch user - set to at least 65536
        hard: 65536
    ports:
      - 9200:9200 # REST API
  website:
    depends_on:
      - reverseproxy
    image: website
    build: ./frontend/website
    env_file: ./frontend/website/.env.local
    restart: always
    extra_hosts:
      - "host.docker.internal:host-gateway"
  api:
    depends_on:
      - reverseproxy
      - redis
      - mongodb
      - opensearch-node1
    image: api
    build: ./backend
    restart: always
    env_file: ./backend/env_local.ini
```

Run the docker-compose file:

```
docker-compose -f docker-compose.yml up -d --build
```

To stop:

```
docker-compose -f docker-compose.yml down
```

If you would like to add the neural options (reranking, generation), add the following to `backend\env_local.ini` and restart docker-compose:

```
neural_api=http://host.docker.internal:9300/
```

Note that the backslashes in the paths above need to be forward slashes if running on Mac/Linux (the above is written for Windows).

Then, navigate to the `neural` folder and add the following to `env_neural_prod.ini`:

```
hf_token=
```

Finally, to start the neural Docker container (requires a GPU), run the following from the `neural` folder, passing the image hash printed by the build to the run command:

```
docker build .
docker run --gpus all --env-file env_neural_prod.ini -p 9300:80 hash_of_above_image
```

#### Extension

Navigate to `frontend\extension`, run `npm ci`, and then run `npm run build`. Then load the `build` folder into Chrome using Developer Mode. Once loaded, open the extension, go to the settings section, change the backend source from textdata.org to "other", and click Save.

### Running test cases

Note: the local Docker containers must be up and running before you run the commands below.

```
cd \backend
pytest .\tests\test_server.py
```

### Running the back-fill script

Note: the local Docker containers must be up and running before you run the commands below.

```
cd \backend
python .\app\helpers\backfill.py [--env_path] [--type=<"submissions" or "webpages">]
```

Here, `--env_path` is an optional argument that takes the path to the environment file; the default is `backend\env_local.ini`. `--type` is another optional argument that takes one of two values, `submissions` or `webpages`; the default is `submissions`.
## Building on Top of the Hosted Backend

See the API documentation [here](https://github.com/thecommunitydigitallibrary/cdl-platform/tree/dev/backend). Please be courteous regarding the number of API calls so that the backend servers do not get overwhelmed.
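One simple way to stay courteous when scripting against the hosted backend is a client-side throttle that spaces out requests. A minimal sketch; the class name and the one-second interval are illustrative choices, not anything the API mandates:

```python
import time

class Throttle:
    """Keep successive API calls at least min_interval seconds apart."""

    def __init__(self, min_interval: float = 1.0):
        self.min_interval = min_interval
        self._last = 0.0

    def wait(self):
        """Sleep just long enough to honor min_interval, then record the call."""
        now = time.monotonic()
        delay = self.min_interval - (now - self._last)
        if delay > 0:
            time.sleep(delay)
        self._last = time.monotonic()
```

Call `throttle.wait()` immediately before each request; bursts of calls then degrade gracefully into an evenly paced stream.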

## Contributors

### How can I contribute?
For any single bug fix or small feature: fork this repository, make a pull request, and describe the change in the request. For a longer-term collaboration, big feature, or large change, please send an email to `kjros2@illinois.edu`.

## Publications