Plan refactoring of file-directory architecture

athityakumar commented 5 years ago

I'll try to collect all my thoughts about the repo structure here, over the next few days.

athityakumar commented 5 years ago

(This comment will probably undergo multiple edits)

deploy.sh
Dockerfile
README.md
LICENSE.md
CONTRIBUTION_GUIDELINES.md

.github/
  ISSUES_TEMPLATE.md
  PR_TEMPLATE.md

.gitignore
.env
.env_example

api/
  graph_formation.py # Merges all graph_formation/ scripts into one
  graph_io.py # Contains networkx-to-neo4j adapters to import graph to database
  sqlite3_adapter.py # Holds all data to be populated into sqlite3
  db_importer.py # One-time script to populate data into the neo4j & sqlite3 databases
  endpoints.py # Handles different endpoints
  flask.py # Handle CORS?
  requirements.txt
  tox.ini

  backend/
    subgraph.py
    abbreviations.py

    graph_formation/ (One-time use only)
      base/
        legal_knowledge_graph.py # includes content of subgraph.py
        judge.py
        case.py
      add_judges.py
      add_key_and_catch_words.py
      add_acts.py
      add_cases.py
      network_analyser.py # Rank important nodes according to pagerank and store them as node attrs
      extract_mappings.py # (Case -> Case id -> Case filename),  (Act -> Year), called by other scripts
      # Probably better to store these mappings into sqlite3 one-time?

    nlp/
      section_extractor.py
      timeline_extractor.py
      summarizer.py
      tf_idf.py
      spellcheck.py # Used for both suggesting in front-end & spell-checking in back-end
      query_to_keywords.py
      query_vector_similarity.py
      dependency_parser.py # Processes query

  db_importer/
    mappings.py # Sqlite3
    graph.py # Neo4j
    act_sections.py # Sqlite3, section_id -> act_id -> section_name -> section_description
    acts.py # Sqlite3, act_id -> act_name -> act_year -> act_filepath -> act_state
    cases.py # Sqlite3

  endpoints/
    lawyer_search.py # Just uses subgraph.py
    layman_search.py # Uses nlp/ to extract keywords and then results from subgraph.py
    case.py # GET /cases/{id}
    judge.py # GET /judges/{id}
    act.py # GET /acts/{id}
    keyword.py # GET /keywords/{id}
    catchword.py # GET /catchwords/{id}
    datatable.py # Handle filtering, sort, and pagination

client/ (Yet to plan properly)
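
The note next to extract_mappings.py asks whether the (Case -> Case id -> Case filename) mappings should be stored in sqlite3 once instead of being recomputed by each script. A minimal sketch of that idea follows. The table name case_mappings, its columns, and both helper functions are assumptions made up for illustration; nothing here is taken from the existing code.

```python
import sqlite3


def store_case_mappings(conn, mappings):
    """Persist (case_id, case_name, case_filename) rows.

    Hypothetical schema: the table and column names are assumptions.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS case_mappings ("
        "case_id TEXT PRIMARY KEY, "
        "case_name TEXT NOT NULL, "
        "case_filename TEXT NOT NULL)"
    )
    # INSERT OR REPLACE keeps the one-time import safe to re-run.
    conn.executemany(
        "INSERT OR REPLACE INTO case_mappings "
        "(case_id, case_name, case_filename) VALUES (?, ?, ?)",
        mappings,
    )
    conn.commit()


def lookup_case_filename(conn, case_id):
    """Return the stored filename for case_id, or None if it is unknown."""
    row = conn.execute(
        "SELECT case_filename FROM case_mappings WHERE case_id = ?",
        (case_id,),
    ).fetchone()
    return row[0] if row else None
```

With something like this in db_importer/mappings.py, the graph_formation/ scripts and the endpoints could query the table instead of calling extract_mappings.py every time.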

athityakumar commented 5 years ago

While deploying:

Run cd api; python3 db_importer.py to migrate all data into Mongo & Neo4j
Run cd api; python3 flask.py to start the back-end
Run cd client; yarn start to start the front-end
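
The same order could be captured in a small driver script, roughly as below. This is only a sketch: the DEPLOY_STEPS list and the run_steps helper are hypothetical names, and it assumes the importer has to finish before the two long-running processes are started.

```python
import subprocess

# (working directory, command, wait_for_exit), in the order listed above.
DEPLOY_STEPS = [
    ("api", ["python3", "db_importer.py"], True),  # one-time import, must finish first
    ("api", ["python3", "flask.py"], False),       # long-running back-end
    ("client", ["yarn", "start"], False),          # long-running front-end
]


def run_steps(steps, run=subprocess.run, spawn=subprocess.Popen):
    """Run blocking steps to completion and start the rest in the background."""
    started = []
    for cwd, cmd, wait_for_exit in steps:
        if wait_for_exit:
            # check=True raises CalledProcessError if the step fails.
            run(cmd, cwd=cwd, check=True)
        else:
            started.append(spawn(cmd, cwd=cwd))
    return started
```

Calling run_steps(DEPLOY_STEPS) from the repo root would return the handles of the two background processes, so a caller can wait on or terminate them later.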

lbs-iitkgp / Opensoft-2019

Plan refactoring of file-directory architecture #54