References for task 1

References:

Shannon, C. E. (1948). A Mathematical Theory of Communication. Bell System Technical Journal.

This foundational paper introduced the concept of n-gram models in communication theory, which is the basis for the trigram models used in language modeling. Available at: Link

Project Gutenberg. (n.d.). Free eBooks.

Source of the five texts used to build the trigram model. Project Gutenberg provides a large collection of public domain books, making it a popular choice for text mining and natural language processing tasks. Available at: https://www.gutenberg.org/

Natural Language Toolkit (NLTK). (n.d.). NLTK: Natural Language Processing with Python.

A comprehensive library for natural language processing in Python, often used for tasks such as text cleaning, tokenization, and working with n-grams. Though not used directly in this project, NLTK is a useful reference for how language models can be implemented. Available at: https://www.nltk.org/

Python Documentation. (n.d.). The Python Standard Library.

Official Python documentation for tools like defaultdict, used in this project for counting trigrams efficiently. Available at: https://docs.python.org/3/library/collections.html#collections.defaultdict

Markov Chains for Language Modeling. (n.d.). Towards Data Science: An introduction to Markov chains in NLP.

Explains how Markov chains and n-gram models are used for language generation and prediction, which is directly related to the concept of trigram models. Available at: Markov Chains in NLP

spaCy Documentation. (n.d.). spaCy: Industrial-strength Natural Language Processing.

Although external libraries were not used, spaCy's text processing techniques provide inspiration for how to structure and clean data in NLP tasks. Available at: https://spacy.io/
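
For context on the defaultdict reference above, here is a minimal sketch of how trigram counting with defaultdict might look. It assumes character-level trigrams and a toy input string; the function name count_trigrams and the sample text are hypothetical and are not taken from the project itself.

```python
from collections import defaultdict


def count_trigrams(text):
    """Count every run of three consecutive characters in text.

    Assumption: character-level trigrams; the project's actual
    tokenization and cleaning steps are not described in this issue.
    """
    counts = defaultdict(int)  # missing keys start at 0, so no key checks needed
    for i in range(len(text) - 2):
        counts[text[i:i + 3]] += 1
    return counts


# Toy input for illustration only, not one of the Project Gutenberg texts.
sample = "ABABAB"
trigram_counts = count_trigrams(sample)
```

Using defaultdict(int) avoids testing whether a trigram has been seen before incrementing it, which keeps the counting loop to a single line.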

JamesDoonan1 / emerging-technologies

References for task 1 #9