Support pluggable syntax for Unison

pchiusano commented 5 years ago

Unison's syntax is just used for parsing and pretty-printing of definitions, but the actual definitions are stored in the codebase as their AST. This means we could support multiple syntaxes for the language. Alice could write some Unison code in Python-like syntax, add it to the codebase, and then Bob could view Alice's definition in C-like syntax or Lisp-like syntax. Your ~/.unison file might have syntax = Pythonish in it to control your personal default for reading and writing.

If you want to help make this happen, it would be pretty straightforward. The first step is abstracting the current parser and prettyprinter into an interface, perhaps something like:

data Syntax v = Syntax
  { parseFile     :: Text -> Either (Parser.Error v) (UnisonFile v Ann)
    -- Pretty (ColorText, Ann) means we can add location-based highlighting after the fact
  , prettyTerm    :: PrettyPrintEnv -> AnnotatedTerm v Ann -> Pretty (ColorText, Ann)
  , prettyType    :: PrettyPrintEnv -> AnnotatedType v Ann -> Pretty (ColorText, Ann)
  , prettyData    :: PrettyPrintEnv -> DataDeclaration' v Ann -> Pretty (ColorText, Ann)
  , prettyAbility :: PrettyPrintEnv -> EffectDeclaration' v Ann -> Pretty (ColorText, Ann)
  }

So define this (perhaps in Unison.Syntax), and then provide an implementation of it for the current syntax (can put in Unison.Syntax.Default). Then it's just an exercise in plumbing a Syntax value everywhere that the current implementation calls the existing parser and prettyprinter directly (maybe rename or move the existing implementations to make sure that the compiler will alert you to all the places that need changing). Once that's done, the choice of Syntax can become a runtime parameter, chosen on startup by reading a ~/.unison file, command line flag, or something similar.
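
The plumbing described above might look roughly like the following. This is only a minimal sketch under loud assumptions: `Term`, `ParseError`, `parseTerm`, `defaultSyntax`, `lispish`, `knownSyntaxes`, `syntaxFromConfig`, and `reformat` are all hypothetical, the record is cut down to one parser and one printer over a toy AST, and the `syntax = name` config line format is assumed from the example earlier in the thread. None of it is Unison's actual code.

```haskell
import Data.Char (isDigit, isSpace)
import Data.List (dropWhileEnd)
import Data.Maybe (fromMaybe, listToMaybe, mapMaybe)

-- Hypothetical toy AST standing in for Unison's real term type.
data Term = Lit Int | Add Term Term
  deriving (Eq, Show)

newtype ParseError = ParseError String
  deriving (Eq, Show)

-- Cut-down version of the proposed record: one parser, one printer.
data Syntax = Syntax
  { syntaxName :: String
  , parseTerm  :: String -> Either ParseError Term
  , prettyTerm :: Term -> String
  }

trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

splitOn :: Char -> String -> [String]
splitOn c s = case break (== c) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn c rest

-- The "current" syntax: flat sums of number literals such as "1 + 2 + 3".
defaultSyntax :: Syntax
defaultSyntax = Syntax "default" parseInfix printInfix
  where
    parseInfix src = foldl1 Add <$> traverse literal (splitOn '+' src)
    literal raw = case trim raw of
      digits | not (null digits) && all isDigit digits -> Right (Lit (read digits))
      other -> Left (ParseError ("expected a number, got: " ++ show other))
    printInfix (Lit n)   = show n
    printInfix (Add a b) = printInfix a ++ " + " ++ printInfix b

-- A second syntax. Only the printer is filled in for this sketch.
lispish :: Syntax
lispish = Syntax "lispish" noParser printLisp
  where
    noParser _ = Left (ParseError "lispish parsing is not implemented in this sketch")
    printLisp (Lit n)   = show n
    printLisp (Add a b) = "(+ " ++ printLisp a ++ " " ++ printLisp b ++ ")"

knownSyntaxes :: [Syntax]
knownSyntaxes = [defaultSyntax, lispish]

-- Choose a syntax from config file contents, falling back to the default
-- when there is no "syntax = ..." line or the name is unknown.
syntaxFromConfig :: String -> Syntax
syntaxFromConfig contents = fromMaybe defaultSyntax $ do
  name <- listToMaybe (mapMaybe settingValue (lines contents))
  lookup name [(syntaxName s, s) | s <- knownSyntaxes]
  where
    settingValue line = case break (== '=') line of
      (key, '=' : value) | trim key == "syntax" -> Just (trim value)
      _ -> Nothing

-- Callers take Syntax values as arguments instead of calling a parser
-- or printer directly: read with one syntax, display with another.
reformat :: Syntax -> Syntax -> String -> Either ParseError String
reformat from to src = prettyTerm to <$> parseTerm from src
```

With these definitions, `reformat defaultSyntax lispish "1 + 2 + 3"` evaluates to `Right "(+ (+ 1 2) 3)"`, and `syntaxFromConfig "syntax = lispish"` selects the second record.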

atacratic commented 5 years ago

Cool!!

aryairani commented 5 years ago

Side note: let's have the user config file and the codebase dirs have different names; they shouldn't both be .unison

aryairani commented 5 years ago

@pchiusano

> -- Pretty (ColorText, Ann) means we can add location-based highlighting after the fact

?

pchiusano commented 5 years ago

Using Pretty.map, you can map over the segments of the Pretty. If those segments know their location, you can pick different colors for each location. This could be used for highlighting portions of a type in type error messages, for instance.
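
To make that concrete, here is a rough sketch of the idea. It does not use Unison's real `Pretty` type: `Pretty` below is a hypothetical flat list of segments, `Span` stands in for `Ann`, `ColorText` is reduced to a color plus a string, and `mapPretty`, `highlight`, `render`, and `exampleType` are illustrative names only.

```haskell
-- Hypothetical source location (start and end column), standing in for Ann.
data Span = Span Int Int
  deriving (Eq, Show)

data Color = Default | Red
  deriving (Eq, Show)

-- Simplified stand-in for ColorText: a color and the text it applies to.
data ColorText = ColorText Color String
  deriving (Eq, Show)

-- Simplified stand-in for Pretty: a flat sequence of segments.
newtype Pretty s = Pretty [s]
  deriving (Eq, Show)

-- Map over the segments, in the spirit of the Pretty.map mentioned above.
mapPretty :: (s -> t) -> Pretty s -> Pretty t
mapPretty f (Pretty segments) = Pretty (map f segments)

-- Two half-open spans overlap when each starts before the other ends.
overlaps :: Span -> Span -> Bool
overlaps (Span a b) (Span c d) = a < d && c < b

-- Recolor every segment whose location overlaps the span to flag.
highlight :: Span -> Pretty (ColorText, Span) -> Pretty (ColorText, Span)
highlight target = mapPretty recolor
  where
    recolor (ColorText color text, loc)
      | loc `overlaps` target = (ColorText Red text, loc)
      | otherwise             = (ColorText color text, loc)

-- Render with ANSI escapes; locations are no longer needed at this point.
render :: Pretty (ColorText, Span) -> String
render (Pretty segments) = concatMap (draw . fst) segments
  where
    draw (ColorText Red text)     = "\ESC[31m" ++ text ++ "\ESC[0m"
    draw (ColorText Default text) = text

-- The type "Nat -> Text", with each segment remembering where it came from.
exampleType :: Pretty (ColorText, Span)
exampleType = Pretty
  [ (ColorText Default "Nat",  Span 0 3)
  , (ColorText Default " -> ", Span 3 7)
  , (ColorText Default "Text", Span 7 11)
  ]
```

Here `render (highlight (Span 7 11) exampleType)` would print `Text` in red and leave the rest untouched, which is the kind of after-the-fact highlighting a type error message could use.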

seagreen commented 3 years ago

I think this is a fantastic idea, which would bring two benefits:

- Support multiple syntaxes (naturally)
- Clearly separate out the parser/printer from the rest of the codebase.

A possible preparation step for this would be to move all parsing and printing code into the new Unison.Syntax.* hierarchy. Unison.Syntax itself would expose the fields you listed above, but as functions instead of fields (parseFile, prettyTerm, etc). That would be the only module from Unison.Syntax.* the rest of the code is allowed to import.

Then when we actually add a second syntax, we could take the final step of reifying that interface into the record you describe and plumbing it around.

What do you think? Do you think my intermediate suggestion would be a good incremental goal, or should we go ahead and both separate out the syntax code and plumb the record through at the same time?

pchiusano commented 3 years ago

I'm not totally sure what would be the easiest refactor. However, at the moment I'm doing surgery on the lexer and parser to support the new documentation format, so I'd probably hold off on refactoring until after that merges. (I'm hoping in the next few weeks.)

seagreen commented 3 years ago

Sounds good to me!

aryairani commented 3 years ago

I could be wrong, but I think it will be hard to even find all of the parsing and printing code unless we are trying to plumb the record through. There are some little bits scattered around, inlined, etc.

seagreen commented 3 years ago

> I could be wrong, but I think it will be hard to even find all of the parsing and printing code unless we are trying to plumb the record through. There are some little bits scattered around, inlined, etc.

@aryairani Great point, though on the other hand anything we can do to split this into separate PRs is going to help.

> However, at the moment I'm doing surgery on the lexer and parser to support the new documentation format, so I'd probably hold off on refactoring until after that merges. (I'm hoping in the next few weeks.)

@pchiusano: How's that coming along? Also, is there an issue for it, so I can track its progress?

LoPoHa commented 3 years ago

Would it be possible to move the parsing and printing into Unison itself?

The Haskell runtime would hold the AST in a defined format and let the Unison codebase interact with it (maybe using a special effect handler?). The Haskell runtime would then check whether the AST is OK and act accordingly.

Installing a new syntax would be as simple as installing a library in Unison.

seagreen commented 2 years ago

@aryairani: What's the status of this in 2022? If this is still the case:

> I could be wrong, but I think it will be hard to even find all of the parsing and printing code unless we are trying to plumb the record through. There are some little bits scattered around, inlined, etc.

...then I'm especially interested in this issue, as getting the parsing/pretty printing separate from everything else should be a solid code clarity win (whatever the particular incremental approach we end up taking).

mitchellwrosen commented 2 years ago

I think that's still the case

ceedubs commented 2 years ago

I think that it's really fascinating that Unison would be so good at supporting pluggable syntax!

But at the same time I actually think that this wouldn't be worth the toll that it could take. Unison code is still going to make its way into textual tools/resources such as StackOverflow, blog posts, GitHub gists, tweets, etc. I think that it would be terribly disorienting to someone trying to learn Unison to see two similar StackOverflow answers with wildly different syntaxes.

It pains me to say it, but while this is a cool possibility, honestly I feel like it should be reserved for minor formatting adjustments (indentation/line width, etc).

asampal commented 2 years ago

> But at the same time I actually think that this wouldn't be worth the toll that it could take. Unison code is still going to make its way into textual tools/resources such as StackOverflow, blog posts, GitHub gists, tweets, etc. I think that it would be terribly disorienting to someone trying to learn Unison to see two similar StackOverflow answers with wildly different syntaxes.

Being held back by legacy mediums seems a regrettable approach if one of the goals of Unison is to advance the state of the art. Why not post links/embeds in these from a source that knows how to deal with Unison?

bearror commented 2 years ago

This is, in my opinion, the missing piece from Unison at the moment.

Doing away with naming, and moving towards a structure that can describe the entirety of a system. Why tie that down to a single representation? With something like Unison, we could finally work on systems from various perspectives without any of them having to own (their little part of) it. For example, UX designers work with tools that represent interactions that are then translated into code, often by other people. It shouldn't be inconceivable to have UX designers interface with the codebase directly, given that interactions are precisely a perspective into the system that is being built. The same goes for project management, security, etc.

At my current work, we deal with clinicians who express medical knowledge as executable rules. Having to learn a textual syntax has consistently been a bottleneck, whereas diagrammatic representations tend to come more naturally. In cases like these, too, it would make a lot of sense to cut out the middleman and embrace that the single source of truth of the system may have many representations. Different people are involved; interested in different things; working in different ways; accomplishing different goals. Why dictate that the system must be canonically described through a textual syntax – especially after going so far to make that unnecessary? Unison might expose a standard way of interfacing with the codebase, and leave it up to the user to build the many representations that make sense in their given contexts. (Sure, they could do that even if the interfacing happens through a textual syntax, but I think that'd be a missed opportunity.)

I don't think I'm alone with these thoughts, as even Alan Kay mentioned the work of Bret Victor on the future of user interfaces. I've personally found the talk on "The Humane Representation of Thought" particularly insightful.

Erudition commented 1 year ago

Unison already requires a different way (Manager) of interacting with the codebase than just editing text, so:

- Existing code from other functional languages (e.g. Elm, my everyday language) could just be dropped into Unison (using the Elm names/syntax) and, bam, it could run parallel and distributed.
- This feature could pave the way for a fully visual programming experience, like Enso but distributed (not just in computations but in simultaneous development).
  - Names could be colored/textured by the content hash they point to, so you can instantly know when two same-named symbols are actually different, without having to read a hash.
  - Functions could be visually rearranged in the most readable style with mere drag-n-drop, so there's no need for pipelining/point-free/parentheses as separate concepts.
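
As a rough illustration of the hash-coloring idea, a color could be derived deterministically from the hash text. This is a sketch under assumptions: `RGB`, `colorForHash`, and `paint` are hypothetical names, the hash is treated as an opaque string rather than in Unison's real hash format, and the mixing function is an arbitrary simple choice, not a recommendation.

```haskell
import Data.Char (ord)
import Data.List (foldl')

data RGB = RGB Int Int Int
  deriving (Eq, Show)

-- Fold the characters of a hash into 24 bits, then split them into three
-- channels. The same hash always yields the same color; different hashes
-- usually (not always) yield different ones.
colorForHash :: String -> RGB
colorForHash hash = RGB r g b
  where
    mixed      = foldl' step 7 hash
    step acc c = (acc * 31 + ord c) `mod` 16777216
    r          = mixed `div` 65536
    g          = (mixed `div` 256) `mod` 256
    b          = mixed `mod` 256

-- Wrap a name in a 24-bit ANSI foreground color chosen by its hash.
paint :: String -> String -> String
paint hash name =
  "\ESC[38;2;" ++ show r ++ ";" ++ show g ++ ";" ++ show b ++ "m" ++ name ++ "\ESC[0m"
  where
    RGB r g b = colorForHash hash
```

Two definitions that share a name but point at different hashes would then usually render in visibly different colors, for example `paint "abc" "foo"` versus `paint "abd" "foo"` (both hashes made up for the example).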

chuwy commented 1 year ago

> Unison code is still going to make its way into textual tools/resources such as StackOverflow, blog posts, GitHub gists, tweets, etc. I think that it would be terribly disorienting to someone trying to learn Unison to see two similar StackOverflow answers with wildly different syntaxes.

I agree with @ceedubs that it would indeed make things harder, but wouldn't the remedy be as easy as pasting the unknown syntax into ucm and printing it back in your preferred syntax? Not perfect UX, but more importantly, I'm sure that the canonical syntax will hold the major part of the market and everything else will be esoteric CoffeeScripts that only enthusiasts care about.

rooney commented 9 months ago

> I think that it's really fascinating that Unison would be so good at supporting pluggable syntax!
>
> But at the same time I actually think that this wouldn't be worth the toll that it could take. Unison code is still going to make its way into textual tools/resources such as StackOverflow, blog posts, GitHub gists, tweets, etc. I think that it would be terribly disorienting to someone trying to learn Unison to see two similar StackOverflow answers with wildly different syntaxes.
>
> It pains me to say it, but while this is a cool possibility, honestly I feel like it should be reserved for minor formatting adjustments (indentation/line width, etc).

We could have a browser extension that translates between Uni-syntaxes.

aryairani commented 9 months ago

I don't see that they do this anymore, but Microsoft used to have a toggle in their web docs for switching examples between different .NET languages.

micklat commented 7 months ago

I'm considering adding Unison as a plugin language for AnyType, and I think having a more conventional syntax would be a definite advantage in that role.

aryairani commented 7 months ago

@micklat That sounds cool. What does that mean?

micklat commented 7 months ago

> @micklat That sounds cool. What does that mean?

There are more explanations in the Unison Discord, here. If that doesn't clarify things, do ask me anything.

unisonweb / unison

Support pluggable syntax for Unison #499