elm-tooling / gsoc-projects

Student Proposal: Implementing a Semantic parser for Elm #16

Open ValerianClerc opened 3 years ago

ValerianClerc commented 3 years ago

Edits: prioritize step 4 alongside step 3, use TDD

Name: Valerian Clerc
Email: valerian.clerc@gmail.com
Slack nickname: vclerc
Potential mentor: TBD

Summary

Following the project idea listed on the GSoC project page, I'd like to add a Semantic parser for the Elm language. Semantic is a GitHub-supported technology that powers GitHub's code navigation. Adding Semantic support for Elm involves building a Haskell library that works with tree-sitter's output to enable enhanced language support on GitHub (and potentially in other dev tools in the future). As described in the Semantic documentation, adding a language to Semantic is a long procedure, but it breaks down into distinct, modular steps, described below.

What will the project focus on

These are the complete steps for adding a new language to Semantic:

  1. Write a tree-sitter parser for the language (already mostly done!).
  2. Create a Haskell library providing an interface to the C source code generated by tree-sitter.
  3. Create a Haskell library in Semantic to auto-generate ASTs.
  4. Add tests for precise ASTs, tagging and graphing, and evaluating code written in the language.
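To make steps 3 and 4 more concrete, here is a minimal sketch of the general idea: turning a generic parse tree into a typed AST, then checking the result against an expected value. Everything in it is a simplified assumption for illustration. The `Node` type is a hypothetical stand-in for tree-sitter's output, the node names (`value_declaration`, `identifier`, `number_literal`) are not taken from the actual tree-sitter-elm grammar, and `Decl`, `Expr`, `toDecl`, and `tags` are invented names, not Semantic's API. Semantic auto-generates its AST types, whereas this sketch writes them by hand.

```haskell
module Main (main) where

import Data.Maybe (mapMaybe)

-- Hypothetical stand-in for a generic parse-tree node: a grammar rule
-- name, the matched source text, and the child nodes.
data Node = Node
  { nodeType :: String
  , nodeText :: String
  , children :: [Node]
  } deriving (Show, Eq)

-- Hypothetical typed AST covering a tiny slice of Elm.
data Decl = ValueDecl String Expr
  deriving (Show, Eq)

data Expr
  = IntLit Int
  | Var String
  deriving (Show, Eq)

-- Convert an expression node; unknown or malformed nodes yield Nothing.
toExpr :: Node -> Maybe Expr
toExpr (Node "number_literal" txt []) =
  case reads txt of
    [(n, "")] -> Just (IntLit n)
    _         -> Nothing
toExpr (Node "identifier" txt []) = Just (Var txt)
toExpr _ = Nothing

-- Convert a top-level declaration node of the shape `name = expr`.
toDecl :: Node -> Maybe Decl
toDecl (Node "value_declaration" _ [Node "identifier" name [], body]) =
  ValueDecl name <$> toExpr body
toDecl _ = Nothing

-- A toy version of "tagging": list the names that declarations define.
tags :: [Decl] -> [String]
tags decls = [name | ValueDecl name _ <- decls]

-- Hand-built tree for the Elm source `answer = 42`.
sampleTree :: Node
sampleTree =
  Node "file" "answer = 42"
    [ Node "value_declaration" "answer = 42"
        [ Node "identifier" "answer" []
        , Node "number_literal" "42" []
        ]
    ]

main :: IO ()
main = do
  let decls = mapMaybe toDecl (children sampleTree)
  print decls         -- [ValueDecl "answer" (IntLit 42)]
  print (tags decls)  -- ["answer"]
```

In a test-first workflow, the expected AST for a small Elm snippet would be written down first (as in the comments in `main`), and the conversion functions would then be extended until the comparison passes.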

How will I achieve this

Benefits

Semantic and tree-sitter-elm are impactful projects because they power features that developers use directly, namely code navigation on GitHub. GitHub code navigation lets users understand code quickly by clicking on a variable or function to see its declaration or other references to that identifier. Semantic is in active development, and GitHub promises that this is just "scratching the surface" of the project's possibilities! Most major languages already have a Semantic package, so creating one for Elm will give it the same level of features and support that languages such as Python and Java have.

Timeline

Weeks of May 17th - 31st:

Community bonding: meeting mentors and Elm community members. Setting and refining goals and expectations for the summer, and familiarizing myself with the Elm ecosystem. Scope out steps 1 and 2 in more rigorous detail. Read documentation! Start playing around with the tree-sitter parser, and define what work still needs to be done on it.

Weeks of June 7th - July 5th:

Heads-down work on this project. Ideally I will finish steps 1 and 2 during this period and start working on steps 3 and 4 (the pace will probably depend on how much work remains on the tree-sitter parser before I can move on).

Week of July 12th:

First evaluation, check in with goals that we set during the first weeks, and reflect on what's working well and what could be changed. Set realistic goals for the rest of summer.

Weeks of July 19th - August 16th:

Code some more! Ideally I will be working on steps 3 and 4 at this point, and will have them wrapped up and well documented by August 16th. I will be moving across the country in early August, so I'm hoping to put in extra hours in the first half of the summer to make up for the distraction and chaos of moving.

Goals

The goal of this project is to advance the development of a Semantic parser for Elm. Conveniently, the project is broken down into discrete steps (1-4), so even if I don't finish it in its entirety, others can build on the steps I have completed. Considering that step 1 is mostly completed and step 2 is a short PR, my main goal will be implementing steps 3 and 4: building a Haskell AST generator for Elm and writing tests for it.

Requirements

Thanks for reading my proposal, and I'm happy to take any feedback or input :)

-Valerian