TheMatten / hask

Funny little Haskell impl
GNU General Public License v3.0
18 stars 0 forks source link

Initial source AST implementation #4

Closed TheMatten closed 3 years ago

TheMatten commented 3 years ago

Let's start with the AST! I've tried to follow the Haskell98 spec where applicable, but I may have missed something, so please make sure that no syntax is missing/incorrect. There're few TODOs around the PR that mention things I wasn't certain about - we probably want to resolve these before merging anything. I've tried to keep things simple and to not try to encode information that would be hard to get in parser, but at the same time I've tried to make types as precise as possible and did exploit possibilities to reduce repetitiveness where possible, specifically:

I've put documentation everywhere - please check the grammar if you can (I'm not a native speaker :sweat_smile:) and just tell me if you think that some of it isn't really necessary - I'm trying to make things easily understandable for readers "from the outside", and by themselves, e.g. in case they appear in hoogle search or you get to them from random link (and completely different context). Tell me what you think about approach to notes - we'll probably need those throughout the codebase later if not now, and this way they would be easily accessible not only from code (by searching for Haddock's #anchors#), but from generated docs too.

I've avoided any sort of shortened terms - this AST is to be manipulated by the compiler, not the user, and so I value clarity over terseness. Please let me know what you thing about naming style we should use.

And we probably want some formatting standard too, so let's use this code to figure it out.

One more thing - I've created separate package for source-level stuff - I think it's worth it if we want any sort of modularity in the future, whether it's to allow others to use parts of the compiler or whether it's for us to try different approaches in parts of pipeline simultaneously (just look at poor people trying to use GHC as a library).

TheMatten commented 3 years ago

Of course I don't support variables in patterns - let me fix that :smile:

TheMatten commented 3 years ago

@Boarders I'll write general answer here - those places where I've kept Type instead of something more concrete are that way by design. I would like to work on level closer to what user expects - that what they write is some type - and then point them to problems in that type formulation later when converting to actual representation used by the compiler. Erroring out with high-level error that describes how those declarations should look like and why do they violate that, not that we found unexpected <random token>. Maybe I'm overthinking this, maybe we could actually apply this principle in more places, maybe such approach has more possible use-cases (like more relaxed IDE coloring). I feel free to do so in parser AST, because we can later make this stuff properly precise further down the pipeline.

TheMatten commented 3 years ago

I imagine we may want to work with phases like

core AST
     ^
     |
minimal, not-directly-printable source AST for desugaring (typechecking?)
     ^
     |
precise renamed (and typechecked?) AST
     ^
     |
relaxed parsed AST

In here, parsed AST fits as something that can support syntax user may write in misunderstanding (but not one written by cat running over keyboard :slightly_smiling_face:)

TheMatten commented 3 years ago

BTW, @Boarders thanks for pointing these out, because it's an important point :slightly_smiling_face:.

Boarders commented 3 years ago

@TheMatten : That sounds good to me and makes sense now I know what the aim was here. Thank you for getting the ball rolling here, this is a great start :).

TheMatten commented 3 years ago

Ok, one more question before I mark this ready for review - Those _Names we parse technically aren't just strings - they need to be well formed. Should I turn them into smart constructors? I mean, something like:

newtype VariableName = UnsafeVariableName Text

pattern VariableName :: Text -> VariableName
pattern VariableName{ variableNameText } <- UnsafeVariableName variableNameText

where VariableName can only be pattern-matched and UnsafeVariableName means "you should first check that you can't do this using existing functions" and "this needs more care during review". BTW, this in general could be nice pattern to adopt in places where precise type would be complicated/unusable.

TheMatten commented 3 years ago

I've separated fields types with strictness into separate record, removing need for Strictness, fixed strictness on unnamed fields and made name constructors unsafe - I think it should be ready for final review.

Boarders commented 3 years ago

Great, looks good to me!

solomon-b commented 3 years ago

Why is the newtype's newtypeConstructorLabel a maybe?

data NewtypeConstructor = NewtypeConstructor{
    newtypeConstructorName  :: ConstructorName
  , newtypeConstructorLabel :: Maybe VariableName
  , newtypeConstructorField :: Type
  } deriving stock Show

edit: err, i was thinking it was the data constructor name not a type variable. Where is the Data Constructor name? Don't we need seperate Type Constructor and Data Constructor name variables?

TheMatten commented 3 years ago

@ssbothwell it's an optional field name (in case you use record syntax).

TheMatten commented 3 years ago

Everyone okay with merging? :slightly_smiling_face:

TheMatten commented 3 years ago

I'll take it as yes :slightly_smiling_face: