jacobwindsor / pubchem-ranker

Ranks compounds by number of BioAssays or BioSystems in PubChem
MIT License

PubChem Ranker

This is a simple Flask application that ranks compounds by the number of BioAssays and BioSystems found for them in PubChem. It provides a web-based interface for viewing the ranked compounds, along with commands for setting up and running the ranker.

Setup

  1. Clone this repository
  2. Make sure you have Python 3 installed
  3. cd into the project directory and run pip install -r requirements.txt
  4. Open CompoundRanker/__init__.py and fill in the "ADMIN_EMAIL" setting. This is required for PubChem
  5. Run python manage.py initdb to initialize the database
  6. Run python manage.py fillmetabs <path> <name>, where path is the absolute path to the CSV file containing the dataset and name is the name you want to give the dataset
  7. Run python manage.py fillcids <name> to gather the CIDs and fill the CIDs table, where name is the dataset name
  8. Run python manage.py fillcounts <name> to fill the counts table, where name is the name of the dataset you want to count. This step takes a long time
  9. Run python manage.py runserver to start the server
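
Steps 5 to 8 always run in the same order for a given dataset, so they can be chained from a small script. The following is only a sketch: the helper names build_commands and run_setup, the path /data/compounds.csv and the dataset name "example" are hypothetical and not part of this project. It assumes the script is started from the project directory so that manage.py is found.

```python
import subprocess
import sys


def build_commands(csv_path, name):
    """Return the manage.py calls for setup steps 5 to 8, in order."""
    # Use the running interpreter; fall back to "python" if it is unknown.
    manage = [sys.executable or "python", "manage.py"]
    return [
        manage + ["initdb"],                     # step 5: initialize the database
        manage + ["fillmetabs", csv_path, name],  # step 6: load the dataset CSV
        manage + ["fillcids", name],              # step 7: fill the CIDs table
        manage + ["fillcounts", name],            # step 8: fill the counts table (slow)
    ]


def run_setup(csv_path, name):
    """Run each step in turn and stop at the first one that fails."""
    for command in build_commands(csv_path, name):
        subprocess.run(command, check=True)


# Only print the commands here; call run_setup(...) to actually execute them.
for command in build_commands("/data/compounds.csv", "example"):
    print(" ".join(command))
```

Step 9 (runserver) is left out on purpose because it keeps running until the server is stopped.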

Dataset format

Datasets must be in CSV format, with one compound per row. Each row starts with the CAS number, followed by the IUPAC name in round brackets. This data must be in the first column; anything in other columns is ignored.

<CAS> (<IUPAC>)
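
To check a file against this format before loading it, something like the sketch below could be used. It is not the project's own parser: the names ROW_PATTERN, parse_cell and read_dataset are hypothetical. It assumes CAS numbers follow the usual digits-digits-digit layout, and that the name runs to the last closing bracket in the cell, so that names containing brackets of their own are kept whole.

```python
import csv
import io
import re

# Assumed layout of the first column: "<CAS> (<IUPAC>)".
# A CAS number is 2 to 7 digits, a hyphen, 2 digits, a hyphen, 1 check digit.
ROW_PATTERN = re.compile(r"^\s*(\d{2,7}-\d{2}-\d)\s*\((.+)\)\s*$")


def parse_cell(cell):
    """Split one first-column cell into (cas, iupac), or None if malformed."""
    match = ROW_PATTERN.match(cell)
    if match is None:
        return None
    return match.group(1), match.group(2).strip()


def read_dataset(handle):
    """Read (cas, iupac) pairs from an open CSV file, skipping bad rows."""
    compounds = []
    for row in csv.reader(handle):
        if not row:
            continue  # blank line
        parsed = parse_cell(row[0])  # only the first column is used
        if parsed is not None:
            compounds.append(parsed)
    return compounds


# Small in-memory example; a real dataset would be opened with open(path, newline="").
sample = io.StringIO(
    "64-17-5 (ethanol),this column is ignored\n"
    "50-78-2 (2-(acetyloxy)benzoic acid)\n"
    "not a valid row\n"
)
print(read_dataset(sample))
```

Rows that do not match are skipped silently here; a stricter check might collect and report them instead.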