
COSMOS: Curated Organizational System for Metadata and Science

Built with Cookiecutter Django. Code style: Black.

COSMOS is a web application for managing the collections indexed in NASA's Science Discovery Engine (SDE). It enables precise content selection and metadata modification before indexing. A live instance runs at https://sde-indexing-helper.nasa-impact.net/.

Basic Commands

Building the Project

$ docker-compose -f local.yml build

Running the Necessary Containers

$ docker-compose -f local.yml up

Non-Docker Local Setup

If you prefer to run the project without Docker, follow these steps:

Postgres Setup

$ psql postgres
postgres=# create database <some database>;
postgres=# create user <some username> with password '<some password>';
postgres=# grant all privileges on database <some database> to <some username>;

# This next one is optional, but it will allow the user to create databases for testing

postgres=# alter role <some username> with superuser;

Environment Variables

Copy .env_sample to .env and update the DATABASE_URL variable with your Postgres credentials.

DATABASE_URL='postgresql://<user>:<password>@localhost:5432/<database>'
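For example, with a hypothetical database cosmos owned by user cosmos_user with password change-me (substitute whatever names you chose in the Postgres setup above), the line would read:

DATABASE_URL='postgresql://cosmos_user:change-me@localhost:5432/cosmos'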

Ensure READ_DOT_ENV_FILE is set to True in config/settings/base.py.
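For reference, in a standard Cookiecutter Django layout that flag sits near the top of config/settings/base.py and gates reading of the .env file. A sketch of the relevant lines (the exact names in this repository may differ slightly):

READ_DOT_ENV_FILE = env.bool("DJANGO_READ_DOT_ENV_FILE", default=True)
if READ_DOT_ENV_FILE:
    # read the .env file at the project root into django-environ's environment
    env.read_env(str(BASE_DIR / ".env"))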

Running the Application

On a first run, apply the database migrations:

$ python manage.py migrate

Then start the development server:

$ python manage.py runserver

Setting Up Users

Creating a Superuser Account

$ docker-compose -f local.yml run --rm django python manage.py createsuperuser

Creating Additional Users

Create additional users through the admin interface (/admin).
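Alternatively, users can be created programmatically from the Django shell. A minimal sketch using Django's standard auth API (the username and password below are placeholders):

$ docker-compose -f local.yml run --rm django python manage.py shell
>>> from django.contrib.auth import get_user_model
>>> get_user_model().objects.create_user("<some username>", password="<some password>")
>>> exit()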

Loading Fixtures

To load collections:

$ docker-compose -f local.yml run --rm django python manage.py loaddata sde_collections/fixtures/collections.json

Manually Creating and Loading a ContentTypeless Backup

On the production server, navigate to the project folder and run the following command to create a backup:

docker-compose -f production.yml run --rm --user root django python manage.py dumpdata --natural-foreign --natural-primary --exclude=contenttypes --exclude=auth.Permission --indent 2 --output /app/backups/prod_backup-20241114.json

The /app/backups/ folder is mounted outside the Docker container, so the backup lands on the host. Copy it to your local machine, then move it into your project folder:

scp sde:/home/ec2-user/sde_indexing_helper/backups/prod_backup-20241114.json prod_backup-20241114.json
mv prod_backup-20241114.json <project_path>/prod_backup-20241114.json

Finally, load the backup into your local database:

docker-compose -f local.yml run --rm django python manage.py loaddata prod_backup-20241114.json

Loading the Database from an Arbitrary Backup

  1. Build the project and run the necessary containers (as documented above).
  2. Clear out content types using the Django shell; Django recreates ContentType rows automatically on migrate, and the stale rows can conflict with those in the backup during loaddata:
$ docker-compose -f local.yml run --rm django python manage.py shell
>>> from django.contrib.contenttypes.models import ContentType
>>> ContentType.objects.all().delete()
>>> exit()
  3. Copy your backup into the container and load it:
$ docker cp /path/to/your/backup.json container_name:/path/inside/container/backup.json
$ docker-compose -f local.yml run --rm django python manage.py loaddata /path/inside/container/backup.json
$ docker-compose -f local.yml run --rm django python manage.py migrate

Restoring the Database from a SQL Dump

If the JSON file is particularly large (over ~1.5 GB), Docker might struggle with this method. In such cases, you can use PostgreSQL's native dump and restore commands as an alternative, as described here.
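As a rough sketch of that alternative, assuming the Postgres containers from production.yml and local.yml and the placeholder credentials used above (consult the linked guide for the exact procedure):

$ docker-compose -f production.yml exec -T postgres pg_dump -U <some username> <some database> > prod_backup.sql
$ docker-compose -f local.yml exec -T postgres psql -U <some username> -d <some database> < prod_backup.sql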

Additional Commands

Type Checks

$ mypy sde_indexing_helper

Test Coverage

To run tests and check coverage:

$ coverage run -m pytest
$ coverage html
$ open htmlcov/index.html

Running Tests with Pytest

$ pytest

Live Reloading and Sass CSS Compilation

Refer to the Cookiecutter Django documentation.

Installing Celery

$ pip install celery

Running a Celery Worker

$ cd sde_indexing_helper
$ celery -A config.celery_app worker -l info

Please note: for Celery's import magic to work, it matters where the celery commands are run. If you run them from the same folder as manage.py, you should be set.
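Once the worker is running, any task registered against the project's Celery app is picked up automatically. A minimal sketch, assuming the app object is exported as app from config/celery_app.py (the Cookiecutter Django convention); the ping task itself is hypothetical:

from config.celery_app import app

@app.task
def ping():
    # trivial task, useful for confirming the worker consumes from the queue
    return "pong"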

Running Celery Beat Scheduler

$ cd sde_indexing_helper
$ celery -A config.celery_app beat

Pre-Commit Hook Instructions

To install pre-commit hooks:

$ pip install pre-commit
$ pre-commit install
$ pre-commit run --all-files

Sentry Setup

Sign up for a free account at Sentry and set the DSN URL in production.

Deployment

Refer to the detailed Cookiecutter Django Docker documentation.

Importing Candidate URLs from the Test Server

Documented here.

Adding New Features/Fixes

We welcome contributions to improve the project! Before you begin, please take a moment to review our Contributing Guidelines. These guidelines will help you understand the process for submitting new features, bug fixes, and other improvements.

Job Creation

Eventually, job creation will be handled seamlessly by the web app. Until then, edit config.py with the details of the sources you want to create jobs for, then run generate_jobs.py.
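For example (a hypothetical invocation, assuming generate_jobs.py sits at the repository root):

$ python generate_jobs.py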

Code Structure for SDE_INDEXING_HELPER

Running Long Scripts on the Server

tmux new -s docker_django

Once inside the session, you can run dmshell.

To reattach to the session later:

tmux attach -t docker_django

To delete the session:

tmux kill-session -t docker_django