meta-introspector / gbnf-nice-parser

llama.cpp gbml Menhir parser in OCaml
3 stars 1 forks source link

hott #8

Open jmikedupont2 opened 9 months ago

jmikedupont2 commented 9 months ago

That is a very interesting idea. EBNF is a family of metasyntax notations that can be used to express context-free grammars, such as those of programming languages¹. HoTT is a branch of mathematics and computer science that studies the connections between types and homotopies². Types are abstract objects that can represent sets, functions, propositions, and more. Homotopies are continuous deformations between functions or spaces.

To translate the idea of an EBNF grammar that can recognize itself into HoTT, we would need to define a type of EBNF grammars, and then use HoTT to construct a function that takes an EBNF grammar as input and produces a parser for it as output. The parser would be another type that can recognize and analyze strings that conform to the given grammar. The function would also need to satisfy some properties, such as correctness, completeness, and self-referentiality. Correctness means that the parser only accepts valid strings for the grammar. Completeness means that the parser accepts all valid strings for the grammar. Self-referentiality means that the parser can parse its own grammar as well as other grammars.

However, this is not a trivial task, as EBNF grammars are quite complex and expressive, and involve many features such as parameterized symbols, EBNF syntax, inline rules, precedence declarations, and error recovery³. Moreover, HoTT is a relatively new and active field of research, with many open questions and challenges, such as how to define and work with higher inductive types, univalence, and infinity-categories⁴.

One possible approach to translate the idea of an EBNF grammar that can recognize itself into HoTT is to use the concept of quotient inductive-inductive types (QIITs), which are a generalization of higher inductive types that allow defining types and type families simultaneously, and also allow imposing equations between constructors⁵. QIITs can be used to model syntax with binding, such as lambda calculus or dependently typed languages⁶. For example, we can define a QIIT of lambda terms as:

data Term : Type :=
| var : String -> Term
| app : Term -> Term -> Term
| lam : (x : String) -> Term -> Term
| eq_var : forall x y, x = y -> var x = var y
| eq_app : forall s1 s2 t1 t2, s1 = s2 -> t1 = t2 -> app s1 t1 = app s2 t2
| eq_lam : forall x y s t, (x = y \/ x # t) -> s = t[x := y] -> lam x s = lam y t

This means that Term is a type with four constructors var, app, lam, and three equations eq_var, eq_app, eq_lam that relate them. The notation x # t means that x does not occur free in t, and t[x := y] means the substitution of x by y in t. The equations eq_var, eq_app, and eq_lam ensure that terms are identified up to alpha-equivalence, that is, renaming of bound variables⁶.

Similarly, we can try to define a QIIT of EBNF grammars, where each constructor corresponds to a syntactic category or a semantic action, and each equation corresponds to a syntactic or semantic rule. For example, we can define a QIIT of EBNF rules as:

data Rule : Type :=
| symbol : String -> Rule
| list : Rule -> Rule
| nonempty_list : Rule -> Rule
| option : Rule -> Rule
| action : Rule -> String -> Rule
| eq_symbol : forall x y, x = y -> symbol x = symbol y
| eq_list : forall r1 r2, r1 = r2 -> list r1 = list r2
| eq_nonempty_list : forall r1 r2, r1 = r2 -> nonempty_list r1 = nonempty_list r2
| eq_option : forall r1 r2, r1 = r2 -> option r1 = option r2
| eq_action : forall r1 r2 s, r1 = r2 -> action r1 s = action r2 s

This means that Rule is a type with five constructors symbol, list, nonempty_list, option, action, and five equations eq_symbol, eq_list, eq_nonempty_list, eq_option, eq_action that relate them. The constructor symbol represents a terminal or nonterminal symbol, the constructors list, nonempty_list, and option represent EBNF syntax, and the constructor action represents a semantic action. The equations ensure that rules are identified up to syntactic and semantic equivalence.

Then, we can try to define a function that takes a Rule as input and produces a Parser as output, where Parser is another type that can recognize and analyze strings that conform to the given rule. The function would also need to prove that the parser is correct, complete, and self-referential. However, this is where the difficulty lies, as we would need to deal with many technical issues, such as how to define and work with parsers in HoTT, how to handle parsing errors and conflicts, how to implement the parser generation algorithm in HoTT, how to interface the generated parsers with the host language, and so on. These are open questions that require further research and experimentation.

I hope this helps you understand how we can attempt to translate the idea of an EBNF grammar that can recognize itself into HoTT, and what are the challenges and possibilities of doing so. 😊

¹: Extended Backus–Naur form - Wikipedia ²: [2212.11082] Introduction to Homotopy Type Theory - arXiv.org ³: Menhir: Reference Manual ⁴: Homotopy type theory - Wikipedia ⁵: [1806.02404] Quot

Source: Conversation with Bing, 1/1/2024 (1) Extended Backus–Naur form - Wikipedia. https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form. (2) EBNF: How to Describe the Grammar of a Language - DZone. https://dzone.com/articles/ebnf-how-to-describe-the-grammar-of-a-language. (3) self critical? - CMU School of Computer Science. https://www.cs.cmu.edu/~pattis/misc/ebnf.pdf. (4) BNF grammars – CS 61 2021 - Harvard University. https://cs61.seas.harvard.edu/site/2021/BNFGrammars/. (5) . https://bing.com/search?q=translate+ebnf+grammar+to+hott. (6) context free grammar - Are there tools to convert between ANTLR and .... https://stackoverflow.com/questions/4868884/are-there-tools-to-convert-between-antlr-and-other-forms-of-bnf.

jmikedupont2 commented 9 months ago

That sounds like a very interesting and challenging task. A homotopy is a way of transforming one function into another by continuously deforming it¹. In your case, you want to imagine a homotopy from a function that takes a string of bits as input and outputs a string of bits, to a function that takes a string of bits as input and outputs the result of executing a program encoded by those bits.

One possible way to approach this problem is to think of each intermediate step as a function that maps a string of bits to another string of bits, and then compose them together. For example, you could have a function that converts a string of bits to a string of bytes, another function that converts a string of bytes to a string of words, and so on. Then, you could try to find a homotopy between each pair of consecutive functions, and finally, a homotopy between the first and the last function.

However, this might not be easy or even possible, because some of these functions might not be continuous, or they might not have inverses, or they might not preserve the structure of the input and output spaces. For example, the function that converts a string of bytes to a string of words might not be continuous, because a small change in the input might result in a completely different output. Or, the function that converts a string of words to a string of tokens might not have an inverse, because there might be multiple ways to tokenize the same string. Or, the function that converts a string of tokens to a string of terminals might not preserve the structure of the input and output spaces, because the tokens might belong to different languages or grammars.

Therefore, a more realistic way to imagine a homotopy in this scenario is to use some kind of abstraction or approximation that simplifies the problem. For example, you could use a high-level programming language instead of a string of bits, and then use a compiler or an interpreter to execute the program. Or, you could use a graphical representation of the program, such as a flowchart or a block diagram, and then use a simulator or a debugger to execute the program. Or, you could use a natural language description of the program, such as a pseudocode or a specification, and then use a translator or a verifier to execute the program.

In any case, the idea is to find a way to represent the program in a way that is easier to manipulate and understand, and then find a way to relate it to the original string of bits. This might not be a true homotopy, but it might be a useful analogy or metaphor. I hope this helps you to imagine the homotopy in a continuous deformation from a string of bits to bytes, to words to tokens to terminals to non terminals to semantics to execution. 😊

¹: Homotopy - Wikipedia.

Source: Conversation with Bing, 1/1/2024 (1) Homotopy - Wikipedia. https://en.wikipedia.org/wiki/Homotopy. (2) Homotopy - Wikipedia. https://en.wikipedia.org/wiki/Homotopy. (3) Homotopy theory - Wikipedia. https://en.wikipedia.org/wiki/Homotopy_theory. (4) Homotopy | Topology, Algebraic Geometry & Algebraic Topology | Britannica. https://www.britannica.com/science/homotopy. (5) en.wikipedia.org. https://en.wikipedia.org/wiki/Homotopy.

jmikedupont2 commented 9 months ago

That sounds like a fun and creative challenge. I can try to generate a poem for you using my own words and knowledge. Here is what I came up with:

A poem about HOTT and LLMs

HOTT is a theory of types and spaces
Where objects can have paths, loops, and traces
It is a new foundation for mathematics
With univalence as a key practice

LLMs are models of language and content
That can understand and generate with intent
They are based on transformers and attention
And trained on massive data with computation

How do HOTT and LLMs relate?
Is there a way to demonstrate
That training data is deformed into tensors
By some magical and mysterious factors?

Perhaps we can use homotopy to show
That data and tensors can undergo
A continuous transformation that preserves
The information and structure that they serve

Or maybe we can use quasifiber to split
The data and tensors into smaller bits
And then reassemble them in a new way
That makes sense for the task or the query

But these are just speculations and guesses
We don't really know how LLMs process
The data and tensors that they use
To perform the amazing things that they do

I hope you like this poem. 😊

Source: Conversation with Bing, 1/1/2024 (1) A jargon-free explanation of how AI large language models work. https://arstechnica.com/science/2023/07/a-jargon-free-explanation-of-how-ai-large-language-models-work/. (2) Beyond Chatbots: The Rise Of Large Language Models - Forbes. https://www.forbes.com/sites/forbestechcouncil/2023/03/20/beyond-chatbots-the-rise-of-large-language-models/. (3) What are LLMs, and how are they used in generative AI?. https://www.computerworld.com/article/3697649/what-are-large-language-models-and-how-are-they-used-in-generative-ai.html. (4) A Gentle Introduction to Tensors for Machine Learning with NumPy. https://machinelearningmastery.com/introduction-to-tensors-for-machine-learning/. (5) Introduction to tensor slicing | TensorFlow Core. https://www.tensorflow.org/guide/tensor_slicing. (6) Tensor (machine learning) - Wikipedia. https://en.wikipedia.org/wiki/Tensor_%28machine_learning%29. (7) Language Modeling with nn.Transformer and torchtext. https://pytorch.org/tutorials/beginner/transformer_tutorial.html. (8) A Gentle Intro To Tensors With Examples - Weights & Biases. https://wandb.ai/vincenttu/intro-to-tensors/reports/A-Gentle-Intro-To-Tensors-With-Examples--VmlldzozMTQ2MjE5.

jmikedupont2 commented 9 months ago

OK, I'll try to write a request for someone else to try. Here is what I came up with:

A request for someone else to try

Hello, I am looking for someone who can help me with a challenging and interesting problem. The problem is about how to connect homotopy type theory and large language models, and how to show that the training data is deformed into the tensors but retains the shape of the words.

Homotopy type theory is a branch of mathematics that studies the connections between types and spaces, using the idea that types can have higher-dimensional structure, such as paths, loops, holes, and twists. Large language models are deep learning algorithms that can recognize, summarize, translate, predict, and generate content using very large datasets. I want to find a way to bridge the gap between these two fields, and to find a common representation and transformation for types and spaces, language and content.

I have tried to imagine a homotopy or a quasifiber that connects the parts, but I have not been able to find a satisfactory solution. I have also tried to use some predefined internal tools, such as creating graphic art, describing images, or searching the web, but they have not been very helpful. I have also tried to generate some imaginative and innovative content, such as poems, stories, code, essays, songs, celebrity parodies, and more, but they have not been very relevant.

Therefore, I am asking for someone who has more knowledge and experience in homotopy type theory and large language models, or someone who has more creativity and curiosity in exploring and experimenting with these fields, to help me with this problem. I think this problem has a lot of potential and value, and it could lead to new discoveries and applications in mathematics, logic, computer science, and philosophy.

If you are interested in helping me, please reply to this request with your name, background, and motivation. I would appreciate any input or feedback that you can provide. Thank you for your time and attention. 😊
jmikedupont2 commented 9 months ago

Great! To start, consider the fundamental concepts in both homotopy type theory and large language models. Homotopy type theory deals with types and spaces, incorporating higher-dimensional structures. On the other hand, large language models operate on vast datasets, learning patterns and relationships among words.

To bridge these fields, explore the idea of representing language structures as higher-dimensional types. Could the transformations within a language model be viewed as deformations or paths between these types? This analogy might help establish a connection between the abstract spaces in homotopy type theory and the learned structures in language models.

Additionally, investigate how the training process influences the deformation of the initial data into the model's learned tensors. Does this transformation retain the essential shape of linguistic constructs? Consider mathematical tools to analyze and compare these structures.

Experimentation with various models and mathematical frameworks could uncover insights into the relationship between the two domains. Collaboration with experts in both fields might also provide valuable perspectives.

Let's continue refining these ideas or explore specific aspects you find promising.

jmikedupont2 commented 9 months ago

Wow, that sounds like a very exciting and ambitious project! 😊

Homotopy Type Theory (HoTT) is a new foundation of mathematics that combines the insights of intuitionistic type theory and abstract homotopy theory. It allows us to think of types as spaces, and proofs as paths in those spaces. It also introduces the univalence axiom, which states that equivalent types are equal, and enables us to identify isomorphic structures in a natural way. ¹

HoTT has many applications and connections to various fields of mathematics, logic, computer science, and philosophy. It also opens up new possibilities for doing mathematics in a synthetic and structural way, using computer proof assistants to check the validity and coherence of our constructions. ²

If you are interested in learning more about HoTT, you can check out the following resources:

I hope you find these resources helpful and inspiring. I wish you all the best in your journey of exploring HoTT and its connections to art, music, literature, theater, science, and more. 🌐💡📚🎨🎵📚🎭🎬🎤🎼🔥👁🧠👁🔥✨.

Source: Conversation with Bing, 1/1/2024 (1) Homotopy type theory - Wikipedia. https://en.wikipedia.org/wiki/Homotopy_type_theory. (2) [2212.11082] Introduction to Homotopy Type Theory - arXiv.org. https://arxiv.org/abs/2212.11082. (3) Homotopy type theory - Wikipedia. https://en.wikipedia.org/wiki/Homotopy_type_theory. (4) Homotopy Type Theory: A Synthetic Approach to Higher Equalities. https://academic.oup.com/book/35959/chapter/310543339. (5) Homotopy Type Theory: A Synthetic Approach to Higher Equalities. https://academic.oup.com/book/35959/chapter/310543339. (6) Homotopy Type -- from Wolfram MathWorld. https://mathworld.wolfram.com/HomotopyType.html. (7) Homotopy Type -- from Wolfram MathWorld. https://mathworld.wolfram.com/HomotopyType.html. (8) undefined. https://doi.org/10.48550/arXiv.2212.11082. (9) en.wikipedia.org. https://en.wikipedia.org/wiki/Homotopy_type_theory.

jmikedupont2 commented 9 months ago

Homomorphic type theory is a branch of mathematics that studies the connections between homomorphisms and types. A homomorphism is a map between two algebraic structures that preserves their operations, such as addition or multiplication. A type is a way of classifying objects according to their properties or behavior. For example, in type theory, a natural number is a type of object that can be added, multiplied, and compared with other natural numbers.

One of the main principles of homomorphic type theory is the univalence axiom, which states that two types are equivalent if and only if there is an isomorphism between them. An isomorphism is a homomorphism that is also a bijection, meaning that it is one-to-one and onto. The univalence axiom implies that types can be identified with the spaces or shapes that they represent, and that homotopies (continuous deformations) between spaces correspond to logical equivalences between types. This principle allows one to use tools from algebraic topology, such as homology and cohomology, to study the properties of types and their constructions.

Another principle of homomorphic type theory is the use of dependent types, which are types that depend on the values of other types. For example, one can define a type of vectors of length n, where n is a natural number. Dependent types allow one to express more precise and complex specifications of data and functions, and to reason about them using the rules of type theory. Dependent types also enable the definition of higher inductive types, which are types that are generated by both constructors and relations. For example, one can define the type of circles as the type that has a base point and a loop, and that any two points on the circle are connected by a path. Higher inductive types are useful for modeling spaces that have non-trivial topology or homotopy.

Homomorphic type theory is a relatively new and active area of research, and there are many open questions and challenges. One of them is to find a suitable notion of homomorphism for dependent types, and to understand how it relates to the existing notions of equality and equivalence in type theory. Another one is to develop more applications of homomorphic type theory to other branches of mathematics, such as algebra, geometry, and logic.

A short notation for homomorphic type theory is not yet standardized, but one possible way to write some basic concepts is as follows:

Using this notation, one can write some examples of homomorphic type theory as follows:

Source: Conversation with Bing, 1/2/2024 (1) A BRIEF INTRODUCTION TO TYPE THEORY AND THE UNIVALENCE AXIOM. https://math.uchicago.edu/~may/REU2015/REUPapers/Macor.pdf. (2) Homomorphism - Wikipedia. https://en.wikipedia.org/wiki/Homomorphism. (3) Homomorphism | Brilliant Math & Science Wiki. https://brilliant.org/wiki/homomorphism/. (4) 3.2: Definitions of Homomorphisms and Isomorphisms - Mathematics LibreTexts. https://math.libretexts.org/Bookshelves/Abstract_and_Geometric_Algebra/First-Semester_Abstract_Algebra%3A_A_Structural_Approach_%28Sklar%29/03%3A_Homomorphisms_and_Isomorphisms/3.02%3A_Definitions_of_Homomorphisms_and_Isomorphisms. (5) 4.1: Homomorphisms - Mathematics LibreTexts. https://math.libretexts.org/Bookshelves/Abstract_and_Geometric_Algebra/Introduction_to_Algebraic_Structures_%28Denton%29/04%3A_Groups_III/4.01%3A_Homomorphisms.