Trogluddite / loombreaker

Tools for building Topic-Specific Web Indexes (CS-480 Capstone)
MIT License

loombreaker

CS-480 Capstone Project

This project is intended to test some ideas around:

Luddites & Loom Breaking

In the modern era, the Luddites are typically associated with anti-technology and anti-progress stances. The reality is more complex: automated looms shifted the balance of power between labor and capital ownership during the Industrial Revolution. Breaking looms was, arguably, one of very few tactics that laborers could use to advocate for fair working conditions.

If you believe the fourth industrial revolution has arrived, one hopes that democratizing powerful tools can help maintain a healthier balance between capital ownership and the needs of individuals.

Within the scope of CS-480, the target deliverables are:

  1. a custom search engine is configured using open source indexing & crawling tools
  2. documents from the search engine are ingested into a Markov or Bayesian network with links to source docs
  3. Markov chains are produced with ranked lists of source documents

Everything beyond that is a stretch goal.
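
As a rough illustration of deliverables 2 and 3, here is a minimal sketch of a word-level Markov chain that remembers which source document contributed each transition, then ranks the sources behind a generated chain. This is not the project's implementation: the function names (`build_chain`, `generate`), the first-order word model, and the sample document ids are all assumptions made for the example, and a real pipeline would read its documents from the search index rather than from an in-memory dict.

```python
import random
from collections import Counter, defaultdict


def build_chain(docs):
    """Build a first-order, word-level Markov chain from {source_id: text}.

    Returns transitions[current][following] -> Counter of source ids,
    so every transition keeps a link back to the documents it came from.
    """
    transitions = defaultdict(lambda: defaultdict(Counter))
    for source, text in docs.items():
        words = text.lower().split()
        for current, following in zip(words, words[1:]):
            transitions[current][following][source] += 1
    return transitions


def generate(transitions, start, length, seed=None):
    """Walk the chain from `start`, producing at most `length` words.

    Returns the generated text and a list of (source_id, count) pairs,
    ranked by how many of the walked transitions each source supports.
    """
    rng = random.Random(seed)
    words = [start]
    sources = Counter()
    for _ in range(length - 1):
        options = transitions.get(words[-1])
        if not options:
            break  # dead end: no document continues from this word
        candidates = sorted(options)
        # Weight each candidate by its total observations across all sources.
        weights = [sum(options[w].values()) for w in candidates]
        following = rng.choices(candidates, weights=weights, k=1)[0]
        sources.update(options[following])
        words.append(following)
    return " ".join(words), sources.most_common()


# Hypothetical sample documents, for illustration only.
docs = {
    "a.txt": "looms weave cloth",
    "b.txt": "looms weave cloth quickly",
}
text, ranked = generate(build_chain(docs), "looms", 4, seed=0)
print(text)    # looms weave cloth quickly
print(ranked)  # [('b.txt', 3), ('a.txt', 2)]
```

Storing a `Counter` of source ids on every transition is what makes the ranked citation list cheap to produce: the ranking is just the sum of the counters along the generated path.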

Problems with web search

Problems with Generative AI

Potential approaches

Note: philosophically, there should be a heavy emphasis on involving human intelligence in the tool-chain – we’re less interested in aesthetics and more interested in using the AI tools to produce intermediate solutions that humans will refine.

Generative AI w/ citations

Topic-specific search engines

Generative search, simplest formation

First iterative improvement: user-guided retraining

Second iterative improvement: GANs

Third iterative improvement: Attention & Transformers for summarization

?? I need to know more about how this works.

Fourth iterative improvement: retrieval-augmented generation

Federated Indexing