semantic-math / math-ast

AST for symbolic math
MIT License

braindump #15

Open evykassirer opened 7 years ago

evykassirer commented 7 years ago

Hi Kevin! I took a look as you requested and here are some thoughts I had while going through:

adding 0 is a no-op and can be removed

  1. does that mean it's removed right away? or there's a function that can remove it? because sometimes this removing should have a before and after (in mathsteps, a step)
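One way to picture the "before and after" in this question is a transform that returns a step object instead of mutating the tree. This is only a sketch: the `Operation` node shape, the `removeAddZero` name, and the `rule` label are assumptions for illustration, not part of the spec or of mathsteps.

```javascript
// Hypothetical node shapes (not taken from the spec):
//   { type: "Operation", op: "+", args: [...] }
//   { type: "Number", value: "0" }, { type: "Identifier", name: "x" }
function isZero(node) {
  return node.type === "Number" && Number(node.value) === 0;
}

// Returns null when nothing changes, otherwise a step holding both trees.
function removeAddZero(node) {
  if (node.type !== "Operation" || node.op !== "+") return null;
  const kept = node.args.filter((arg) => !isZero(arg));
  if (kept.length === node.args.length) return null;

  let after;
  if (kept.length === 0) after = { type: "Number", value: "0" }; // 0 + 0
  else if (kept.length === 1) after = kept[0];                   // x + 0
  else after = { ...node, args: kept };                          // x + 0 + y

  return { rule: "REMOVE_ADDING_ZERO", before: node, after };
}

const sum = {
  type: "Operation",
  op: "+",
  args: [{ type: "Identifier", name: "x" }, { type: "Number", value: "0" }],
};
const step = removeAddZero(sum);
console.log(JSON.stringify(step.after)); // {"type":"Identifier","name":"x"}
```

Because the original node is kept in `before`, a caller can either apply the simplification silently or show it as an explicit step.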

raising 0 to the 0 power is undefined

  1. here are some more indeterminate forms: https://en.wikipedia.org/wiki/Indeterminate_form might be worth adding?

  2. will there be standards about what can be children of other nodes? now that you cover many concepts, it won't make sense for an equation node to be a child of a geometry node for example

  3. plus-or-minus can be binary for things like the quadratic formula right?

  4. would log base 3 of 9 be log(3, 9)? or can we have some way of storing base and argument?

If it's a second derivative with respect to x, then x will appear twice in the variables array.

  1. you could also have derivative of derivative of x (is that more or less intuitive/easy to read?)

  2. what's the difference between what you'd use a list for and a program for?

also the bottom was a bit hard to understand cause there's less formatting

And I found some typos, not sure if that's something you care about right now though:

overall - wow! so thorough. making lists is so fun, I'm glad you're working on this!

kevinbarabash commented 7 years ago

@evykassirer thanks for the braindump!

  1. The useful transformations would be a separate repo from the parser and this AST spec. I was thinking that there are going to be a number of transforms that people will find useful and it seems silly to have them replicated in a number of different places. My hope is that other people would create other libraries of transforms that are useful for different parts of math, e.g. trig identities, and then people could grab the transforms that they need from different libraries and only have to worry about how those transforms are applied.

  2. Cool. I'm not sure whether it makes sense to bake semantic details like that into an AST spec or to have a separate library that validates trees as being semantically valid or not. My gut feeling is that having a separate library for validation, with this repo being simply a spec, is probably the way to go.

  3. I define an Expression in https://github.com/kevinbarabash/math-ast/blob/master/spec.md#relation in order to prevent this. This should probably be called out more explicitly with a heading for Expression nodes. There are probably other parent-child combinations that we want to avoid as well, e.g. geometry nodes can only be children of certain relations, e.g. congruent, incongruent... only lines can be parallel. It probably does make sense to call out semantic details like this in the spec, but actually implement the validation of them in a separate repo.

  4. My thinking with plus or minus is that something like quadratic formula would be represented in the following way:

    (/ (+ (- b) (plus_or_minus (sqrt (+ (^ b 2) (- (* 4 a c)))))) (* 2 a))

    I probably messed up the parens there, but basically it would act similar to negation and there could be a flag similar to wasMinus to differentiate between a + \pm b and a \pm b...hopefully we can reuse the same flag but call it something different.

  5. I think for logs we probably want to always store two parameters and somehow mark whether the original input was lg(n) or log_2(n). Whatever we do for this, we should probably have roots mimic it as well. The syntax that is used depends on the parser. Some parsers may choose to go with log(3, 9) while others may support log_3(9). The node rep would look like:

    {
      type: "Function",
      id: { type: "Identifier", name: "log" },
      args: [ { type: "Number", value: "3" }, { type: "Number", value: "9" } ],
      inverse: false,
    }

you could also have derivative of derivative of x (is that more or less intuitive/easy to read?)

Good point. I hadn't thought of that. The derivative stuff is fuzzier. I should update that section to indicate that it's fuzzy and to list some different possibilities along with the pros and cons of each.
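To make the two possibilities concrete, here is a rough sketch of both shapes for a second derivative with respect to x, plus a helper that converts the nested form into the flat one. Only the `variables` array comes from the spec text quoted above; the `Derivative` type name, the `arg` field, and `flattenDerivative` are assumptions for illustration.

```javascript
// Option A: one node, with x repeated in the variables array.
const flat = {
  type: "Derivative",
  arg: { type: "Identifier", name: "f" },
  variables: [
    { type: "Identifier", name: "x" },
    { type: "Identifier", name: "x" },
  ],
};

// Option B: derivative of a derivative.
const nested = {
  type: "Derivative",
  arg: {
    type: "Derivative",
    arg: { type: "Identifier", name: "f" },
    variables: [{ type: "Identifier", name: "x" }],
  },
  variables: [{ type: "Identifier", name: "x" }],
};

// Collapses nested Derivative nodes into a single node.
// Variables of the innermost derivative come first.
function flattenDerivative(node) {
  if (node.type !== "Derivative") return node;
  const inner = flattenDerivative(node.arg);
  if (inner.type !== "Derivative") return { ...node, arg: inner };
  return {
    type: "Derivative",
    arg: inner.arg,
    variables: [...inner.variables, ...node.variables],
  };
}

console.log(JSON.stringify(flattenDerivative(nested)) === JSON.stringify(flat)); // true
```

Since one form can be converted mechanically into the other, the choice is mostly about which is easier to read and to write transforms against.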

  1. List vs. Program... maybe Proof is a better word than Program, but the idea is it would be a set of mathematical statements which would go on separate lines... each statement could maybe have some text to go along with it or a number. A List would be used for a set of items that isn't a Set. It could also be the child of a Set node as one way of specifying a set. Many of these ideas are very closely related.

What makes a sequence different from a list? A sequence is more specific because it only allows expressions as items, but a list could have relations as items in the case of a system of equations. TBH, I think List will probably be split into more specific node types such as Sequence, System (of equations), etc. See https://github.com/kevinbarabash/math-parser/blob/master/lib/parse.js#L99-L114.
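A minimal sketch of that split, assuming relation nodes carry `type: "Relation"` and everything else counts as an expression (both the type name and the `classifyList` helper are hypothetical, not from the spec or the parser):

```javascript
// Assumption: relations are tagged with type "Relation".
function isRelation(node) {
  return node.type === "Relation";
}

// Picks the most specific name that fits the items of a list.
function classifyList(items) {
  if (items.length === 0) return "List";
  if (items.every(isRelation)) return "System";
  if (items.every((item) => !isRelation(item))) return "Sequence";
  return "List"; // mixed items stay a plain list
}

const num = (value) => ({ type: "Number", value });
const eq = (left, right) => ({ type: "Relation", rel: "=", args: [left, right] });

console.log(classifyList([num("1"), num("2"), num("3")]));                 // Sequence
console.log(classifyList([eq(num("1"), num("1")), eq(num("2"), num("2"))])); // System
console.log(classifyList([num("1"), eq(num("2"), num("2"))]));             // List
```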

I need to update the spec in this area. Also, I noticed that some of the links/formatting is messed up towards the bottom.

And I found some typos, not sure if that's something you care about right now though:

Thanks. I'll try to fix those as I make other changes.

overall - wow! so thorough. making lists is so fun, I'm glad you're working on this!

I'm glad to hear it. I think there's still a lot that could be added, see: https://www.w3.org/TR/MathML/chapter4.html. We could just re-interpret MathML in JSON. One thing that I like about having different node types for relations, operations, and functions is that it's easier to tell that you shouldn't be adding two relations... well, I guess you could when solving a system of equations.

Maybe having these distinctions isn't that important. We definitely wouldn't want to be adding lines together, but that's more about what types of operands are valid for different operations/functions/relations. It seems like operations/functions/relations aren't necessarily super useful distinctions. When evaluating them, they each have input values and output values.
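The operand-validity idea could live in a separate validator driven by a rule table, roughly like the sketch below. Everything here is an assumption for illustration: the `op` field, the `Line` node type, the rule keys, and the `validate` function are not defined by the spec.

```javascript
// Hypothetical rule table: which child types each parent node accepts.
const EXPRESSION_TYPES = ["Number", "Identifier", "Operation", "Function"];
const rules = {
  "Operation:+": EXPRESSION_TYPES,
  "Relation:=": EXPRESSION_TYPES,
  "Relation:parallel": ["Line"],
};

// Walks the tree and collects one message per invalid operand.
// Nodes without a matching rule are not checked.
function validate(node, errors = []) {
  const key = `${node.type}:${node.op}`;
  const allowed = rules[key];
  for (const child of node.args || []) {
    if (allowed && !allowed.includes(child.type)) {
      errors.push(`${child.type} is not a valid operand of ${key}`);
    }
    validate(child, errors);
  }
  return errors;
}

const line = (name) => ({ type: "Line", name });
const addLines = { type: "Operation", op: "+", args: [line("AB"), line("CD")] };
const parallelLines = { type: "Relation", op: "parallel", args: [line("AB"), line("CD")] };

console.log(validate(addLines).length);      // 2
console.log(validate(parallelLines).length); // 0
```

With this shape the node-type distinction matters less than the rule table, which matches the point that validity is about operand types.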

Anyways, I'd love to get your thoughts on this.

kevinbarabash commented 7 years ago

Some big gaps I noticed in MathML content markup are:

...which we're tackling. ^_^

evykassirer commented 7 years ago

wow MathML is huge! not sure how much time I can invest into really digging into this but @aliang8 can hopefully help you out a bunch! And I'm glad to brainstorm on specific issues or high level ideas and stuff too.

yeah I definitely like the idea of having more detail in what kind of node something is - makes it easier to make sure they're being used properly :) though parsing them might be harder?

anyways - super exciting stuff! let me know if you'd like thoughts on anything else!