Open Necr0x0Der opened 1 year ago
Apparently, we cannot document each concrete instance, e.g., each number.
But we can add the same documentation for each token (you probably meant this; I just wanted to mention it explicitly). On the other hand, for grounded value tokens we probably need to document types instead of the tokens themselves.
- Doc-expressions are mere expressions over strings put arbitrarily into spaces, e.g.
Not sure I understand the "cons" properly. We can define `doc` as a grounded function and get the documentation string from a grounded atom or from an atomspace, depending on the atom passed as an argument to the `doc` function.
- Structured doc-expressions in a dedicated space.
Another downside of having a separate `&doc` space is that we need one more global space. I wanted to eliminate global spaces inside grounded functions in "minimal MeTTa", as using them causes problems. In particular, each imported space should have its own copy of the grounded symbol with the `&self` space embedded.
I like the structured documentation approach from option (2), but I think we could keep documentation in the same atomspace where the code of the module lives. I am very keen on the idea of replacing the current MeTTa runner with a space that is able to keep and run grounded functions. In this context, docs and types of grounded functions can be represented uniformly.
Until a uniform representation is implemented, we could document grounded atoms and pure functions in different ways while providing a uniform interface through the MeTTa runner.
In what format to place such pieces of documentation: doc-strings, doc-expressions with certain type structure, or something else?
To me the most convenient way is to keep documentation as expressions with a specified structure. A separate doc-formatting function can then read the data and present it to the user in the most convenient way.
Where exactly to place them: the library space itself, a global &doc space, a separate doc-space for each library?
I would suggest keeping documentation in the same space where corresponding atoms are kept. In this case the documentation hierarchy is the same as the hierarchy of modules.
Where is the MeTTa code for documentation formed? Should `register_token` accept it as an additional argument? Should the space being imported just contain doc-expressions as stand-alone expressions similar to type definitions, or should these expressions somehow be placed inside equalities like Python docstrings?
As a first step we could provide documentation for the pure symbols explicitly in an expression form. This allows using the current parser without modifications.
For tokens we don't have a simple solution. We can have multiple token definitions which effectively represent the same token, so adding documentation to the `register_token` call would lead to code duplication.
As tokens can be duplicated, the question is whether we need to document each registered token. Maybe it is enough to document the type of the value for tokens which produce values, and when a token produces a grounded function we could document the function itself. I would suggest keeping documentation of grounded tokens inside the accompanying space. For stdlib it is `metta_code`; for 3rd-party modules it should be a part of the module.
How should documentation (+type declarations) be automatically gathered and can missing documentation be automatically detected?
I would say that in general we should have each symbol and each grounded atom documented. Then, when a user asks for help on some symbol or grounded atom name, the `help` function can provide the documentation for it.
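To make the "missing documentation" question more concrete, here is a minimal Python sketch of one possible check. It is only a model: expressions are represented as nested tuples of strings rather than real atoms, and the names `symbols`, `undocumented`, and the `ignore` list are hypothetical, not part of Hyperon.

```python
def symbols(expr):
    """Yield every symbol name in a nested-tuple expression, skipping variables."""
    if isinstance(expr, tuple):
        for child in expr:
            yield from symbols(child)
    elif isinstance(expr, str) and not expr.startswith("$"):
        yield expr


def undocumented(space, documented, ignore=("=", ":", "->")):
    """Return the sorted symbols that occur in `space` but have no documentation.

    `space` is a list of nested tuples standing in for the atoms of a module,
    `documented` is the set of names that already have doc-expressions, and
    `ignore` lists built-in symbols that are assumed not to need docs here.
    """
    found = {name for expr in space for name in symbols(expr)}
    return sorted(found - set(documented) - set(ignore))


# A toy module: a type declaration and one equality.
space = [
    (":", "foo", ("->", "Number", "Number")),
    ("=", ("foo", "$x"), ("bar", "$x")),
]
missing = undocumented(space, documented={"foo", "Number"})
```

Here `missing` contains only `bar`, since `foo` and `Number` are listed as documented and the built-in symbols are ignored. Grounded atoms would need separate handling, as discussed in the surrounding text.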
`help` function.
The `help` function checks the metatype of the atom. If the atom's metatype is `Grounded`, then `help` gets the type of the atom, and there are two options: if the type is a function type, `help` searches for documentation by the atom itself; otherwise `help` searches for documentation of the type atom. In the case of the `Symbol` metatype, `help` could search for documentation using the symbol itself. For the `Expression` metatype, `help` can try to search for documentation of the whole expression or of the first atom of the expression, also depending on its type.
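The lookup order described here can be sketched in Python as follows. This is a simplified model under stated assumptions: the `Symbol`, `Expression`, and `Grounded` classes and the `docs` dictionary are hypothetical stand-ins, not the Hyperon API, and for expressions the sketch tries the whole expression first and then falls back to its first atom.

```python
from dataclasses import dataclass


# Hypothetical, simplified atom model for illustration; NOT the Hyperon API.
@dataclass(frozen=True)
class Symbol:
    name: str


@dataclass(frozen=True)
class Expression:
    children: tuple


@dataclass(frozen=True)
class Grounded:
    name: str
    atom_type: object  # a Symbol, or an Expression such as (-> Number Number)


ARROW = Symbol("->")


def is_function_type(atom_type):
    """A type is treated as a function type if it is an expression headed by ->."""
    return (isinstance(atom_type, Expression)
            and len(atom_type.children) > 0
            and atom_type.children[0] == ARROW)


def help_lookup(atom, docs):
    """Find documentation for `atom` in `docs` (a dict keyed by atoms), or None."""
    if isinstance(atom, Expression):
        # Try the whole expression first, then fall back to its first atom.
        whole = docs.get(atom)
        if whole is not None:
            return whole
        return help_lookup(atom.children[0], docs) if atom.children else None
    if isinstance(atom, Grounded):
        if is_function_type(atom.atom_type):
            return docs.get(atom)        # grounded function: search by the atom
        return docs.get(atom.atom_type)  # grounded value: search by its type
    return docs.get(atom)                # plain symbol: search by the symbol


NUMBER = Symbol("Number")
plus = Grounded("+", Expression((ARROW, NUMBER, NUMBER, NUMBER)))
five = Grounded("5", NUMBER)
docs = {plus: "Sums two numbers", NUMBER: "Numeric value"}
```

With this data, `help_lookup(five, docs)` resolves through the `Number` type, while `help_lookup(Expression((plus, five, five)), docs)` falls back to the documentation of `+`.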
Documentation of the function:
(function
  (description "Description of the function")
  (parameters
    ; name of the parameter is extracted from the function definition
    ; type of the parameter is extracted from the function type
    ; in/out: effectively, each parameter in MeTTa can input and output a value
    ; at the same time, but sometimes it makes sense to restrict this
    ; to only input or only output; this can be done using `sealed`
    (parameter "First parameter's description")
    [... (parameter "Second parameter's description")])
  ; type of the returned value is extracted from the function type
  (return "Description of the return result"))
Documentation of the other atoms:
(atom (description "Description of the atom"))
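A separate formatting function over such structured docs could look roughly like the Python sketch below. It models doc-expressions as nested tuples and assumes that `return` is a sibling of `parameters` inside `function`. The names `field` and `format_doc` are hypothetical, and parameter names and the type signature are passed in explicitly because, per the format, they come from the function definition and the function type rather than from the doc itself.

```python
def field(doc, tag):
    """Return the tail of the first sub-expression of `doc` that starts with `tag`."""
    for item in doc[1:]:
        if isinstance(item, tuple) and item and item[0] == tag:
            return item[1:]
    return ()


def format_doc(name, doc, param_names=(), signature=()):
    """Render a structured doc as plain text.

    `signature` lists the argument types followed by the return type,
    e.g. ("Number", "Number", "Number") for (-> Number Number Number).
    """
    kind = doc[0]
    description = field(doc, "description")
    text = description[0] if description else "(no description)"
    lines = [f"{kind} {name}: {text}"]
    if kind == "function":
        for i, param in enumerate(field(doc, "parameters")):
            pname = param_names[i] if i < len(param_names) else f"arg{i}"
            ptype = signature[i] if i < len(signature) - 1 else "?"
            lines.append(f"  {pname} : {ptype} - {param[1]}")
        ret = field(doc, "return")
        if ret:
            rtype = signature[-1] if signature else "?"
            lines.append(f"  returns {rtype} - {ret[0]}")
    return "\n".join(lines)


add_doc = (
    "function",
    ("description", "Sums two numbers"),
    ("parameters",
     ("parameter", "First summand"),
     ("parameter", "Second summand")),
    ("return", "Sum"),
)
rendered = format_doc("add", add_doc, ("$x", "$y"), ("Number", "Number", "Number"))
```

Because the doc is plain data, the same structure could be rendered to other targets (for example HTML) by a different formatter without touching the documentation itself.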
There are different possibilities for linking an atom to its documentation:
- define `(= (doc <atom>) ...)`, which returns the documentation of `<atom>`;
- add `(name ...)` to the documentation atom and use it as an anchor to search for documentation from the `help` function.
I would try to implement `help` and documentation examples, and finalize the decision after experimenting.
To me the most convenient way is to keep documentation as expressions with a specified structure.
Yes
I would suggest keeping documentation in the same space where corresponding atoms are kept.
OK
As a first step we could provide documentation for the pure symbols explicitly in an expression form.
OK
Proposed documentation format
Looks good. The only concern I have is that tuples are not conveniently deconstructable. OTOH, providing descriptions of each parameter as a separate expression is also not convenient in terms of the parameter order specification.
There is an idea to mark functions as deterministic/non-deterministic. It is similar to the in/out value passing direction. On the one hand, we could mark it as such in the documentation. On the other hand, it is a part of the function contract and should be available for analysis by the interpreter, and thus should be a part of the function definition.
Providing this info in docs in such a way that it looks formal but doesn't influence the interpreter, and can contradict the function behavior (and the function contract, if the contract includes it), looks like a possible source of confusion. OTOH, consistency of documentation and function contracts can be automatically checked (which is also a possible case for formal parameters). Maybe we should just call it `spec` instead of `doc` and add any metadata there :)
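As an illustration of what such an automatic consistency check might involve, here is a small Python sketch. Everything in it is an assumption made for the example: docs are modeled as nested tuples, the `(deterministic ...)` field and the `contract` dictionary are hypothetical, and `check_spec` is not an existing function.

```python
def check_spec(doc, signature, contract=None):
    """Return a list of mismatches between a doc/spec and the function's type and contract.

    `signature` lists argument types followed by the return type.
    `contract` is a hypothetical dict of formally known properties,
    e.g. {"deterministic": True}.
    """
    problems = []
    contract = contract or {}

    # 1. The number of documented parameters should match the arity of the type.
    params = ()
    for item in doc[1:]:
        if isinstance(item, tuple) and item and item[0] == "parameters":
            params = item[1:]
    arity = max(len(signature) - 1, 0)
    if len(params) != arity:
        problems.append(f"documented {len(params)} parameter(s), type has {arity}")

    # 2. Any metadata field that also appears in the contract should agree with it.
    for item in doc[1:]:
        if isinstance(item, tuple) and len(item) == 2 and item[0] in contract:
            if contract[item[0]] != item[1]:
                problems.append(
                    f"{item[0]}: doc says {item[1]!r}, contract says {contract[item[0]]!r}"
                )
    return problems


spec = (
    "function",
    ("description", "Sums two numbers"),
    ("parameters", ("parameter", "First summand"), ("parameter", "Second summand")),
    ("return", "Sum"),
    ("deterministic", True),  # hypothetical metadata field
)
ok = check_spec(spec, ("Number", "Number", "Number"), {"deterministic": True})
bad = check_spec(spec, ("Number", "Number"), {"deterministic": False})
```

In this sketch `ok` is empty, while `bad` reports both the parameter-count mismatch and the disagreement about determinism.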
https://github.com/trueagi-io/hyperon-experimental/pull/694 adds documentation for the standard library.
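This is a subissue of #319.
The goal is to introduce documentation machinery for MeTTa code in general, libraries in particular, and `stdlib` especially.
The proposal is to document MeTTa code in MeTTa itself. There are a few choices to be made:
- In what format to place such pieces of documentation: doc-strings, doc-expressions with certain type structure, or something else?
- Where exactly to place them: the library space itself, a global `&doc` space, a separate doc-space for each library?
- Where is the MeTTa code for documentation formed? Should `register_token` accept it as an additional argument? Should the space being imported just contain doc-expressions as stand-alone expressions similar to type definitions, or should these expressions somehow be placed inside equalities like Python docstrings?
- How should documentation (+type declarations) be automatically gathered and can missing documentation be automatically detected?
There are a few issues that may prevent us from answering all these questions atm: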
- [...] `(: ($t A) (B $t))` is a valid expression. All mentioned symbols can be extracted from an entire space, though. Tokens for grounded atoms are regular expressions. Apparently, we cannot document each concrete instance, e.g., each number. The situation may partly change if we introduce bindings as a part of spaces.
- [...] `stdlib`, we can either put doc-expressions in its space or do anything else we want. But will “anything else” work for custom libraries? For the case of importing pure MeTTa scripts, we have no choice but to put doc-expressions into the scripts themselves, although it doesn’t necessarily mean that these expressions will remain in the same space. E.g., documenting can be done via import-time execution (that is, with something like `! (doc …)`). Also, documentation for libraries and for pure MeTTa code may not necessarily be done identically.
Let’s consider some examples of what it could look like.
1) Doc-expressions are mere expressions over strings put arbitrarily into spaces, e.g.
For `stdlib`, all docs are just put in `stdlib.rs/metta_code()`. Thus, all docs can be retrieved by matching against `&self`, since the `&stdlib` space is placed into it. An option is to have `doc` as a function.
Pros: no need for special implementation, uniformity over different use cases.
Cons: definitions are detached from the implementation of grounded functions. The main space is littered with doc-expressions. No formal structure amenable to automatic processing is provided.
2) Structured doc-expressions in a dedicated space.
The structure can be different. Some doc-expressions (e.g. module) can be added automatically. `doc` is a grounded function defined in `stdlib` that puts its input into the `&doc` space and can be called from any script. Then `(match &doc (doc …))` can be used to retrieve doc-expressions. We can move in the direction of an even more complete formal specification of the semantics and pragmatics of symbols and grounded atoms for self-programming. But maybe we just need to keep the possibility for future extensions, with doc-expression types explicitly indicating whether the description is provided as a string or as a richer structure.
Pros: possible further automation and additional formatting of the documentation with possible cross-references, etc.
Cons: it might be a little bit annoying to follow the structure of `doc`, but this can be mitigated by providing syntactic sugar for loose doc-strings.
3) In-place documentation, e.g. doxygen-like
Such comments can be processed by the parser in a special way and turned into expressions. Pros: may look convenient and human-readable. Cons: we don’t have a single definition per function; special processing may make this harder to extend in the future, etc.
These examples are not mutually exclusive and can be partly combined. There are more details to discuss and flesh out, of course.
We can proceed step-by-step and start with agreeing on embedding MeTTa documentation in MeTTa itself, with further automatic extraction to HTML or whatever.