camfort / fortran-src

Fortran parsing and static analysis infrastructure
https://hackage.haskell.org/package/fortran-src

Serialize internal Fortran AST representation to JSON #215

Closed uNouss closed 1 year ago

uNouss commented 2 years ago

Hello, is there a way to export the AST produced by fortran-src in a format like XML? I would like to use this tool to parse Fortran 77 and obtain an AST. That part works, but the output appears to be a Haskell ADT. As I have never coded in Haskell, I would like to know if there is a way to get this AST in XML instead.

For example, for this source: big_ints_real.f, fortran-src produces the following AST: big_ints_real.ast

raehik commented 2 years ago

We don't currently have support for serializing the parsed AST to an interchange format like XML. It's possible, but I think using the generated XML would be difficult. You would have to do a decent amount of parsing, since Fortran has a lot of syntax.

Could I ask how you want to use the AST? We could provide an XML exporter for parts of the syntax. But I'd be a bit wary of supporting full conversion to an interchange format, since it would mean supporting more things in our codebase.

uNouss commented 2 years ago

Thank you for your feedback. I would like to manipulate the AST in XML or JSON format. There are already upstream tools that let you work with these two formats, but most of the parsers I have seen either do not produce an AST or produce one in a format that, like yours, requires a lot of further processing.

I understand that this may not meet your needs and could add unnecessary functionality to your codebase. I would have liked to do the export in an XML/JSON exchange format myself and propose it in a pull request but I am a beginner in Haskell.

raehik commented 2 years ago

Thanks! I was worried about both of those points. JSON in Haskell is much simpler and a lot less maintenance, and I've used it to make pretty JSON serializations for Haskell types before. That and your assurance has me much more confident we can provide something useful.

I brought this up at a meeting with some collaborators earlier, and apparently they have a tool that generates JSON for a subset of our AST. I'll start some initial work while we see if we could potentially complete and open source their existing tool.

Can I rename this issue to reference JSON instead? And I'll put any interesting updates here.

uNouss commented 2 years ago

Great, that is good to know. It would be great to have the AST in JSON. Yes, you can rename it, of course.

Thanks.

raehik commented 2 years ago

I have some work ongoing at https://github.com/camfort/fortran-src-aeson/ which aims at providing a decent JSON representation of the AST. Here's what the tool exports for your attached file: big_ints_real.f.yaml.txt (used YAML so it's a bit easier to scan through, but either means both).

Sum types are handled by having a tag field, then a contents field where the structure depends on what tag was (which constructor it's pointing to). So depending on context, you need to expect different values of tag and know what they mean for the structure of contents.
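
As a rough illustration of consuming the tag/contents scheme described above, here is a minimal Python sketch. The sample document is hypothetical: the tag names and the `name`/`args` fields are invented for the example and are not the tool's actual output, so check the real export before relying on any of them.

```python
import json

# Hypothetical document: tag names and field layout are assumptions made
# for illustration, not the schema emitted by fortran-src-aeson.
SAMPLE = json.loads("""
{
  "tag": "write",
  "contents": {
    "args": [
      {
        "tag": "function_call",
        "contents": {
          "name": "abs",
          "args": [{"tag": "value", "contents": "x"}]
        }
      }
    ]
  }
}
""")


def describe(node):
    """Render a node as text by dispatching on its tag.

    The shape of `contents` is only known once `tag` has been inspected,
    so every branch reads `contents` differently.
    """
    tag, contents = node["tag"], node["contents"]
    if tag == "write":
        return "write(" + ", ".join(describe(a) for a in contents["args"]) + ")"
    if tag == "function_call":
        inner = ", ".join(describe(a) for a in contents["args"])
        return contents["name"] + "(" + inner + ")"
    if tag == "value":
        return str(contents)
    # An unknown tag means the consumer is out of sync with the producer.
    raise ValueError(f"unexpected tag: {tag}")


print(describe(SAMPLE))
```

Raising on an unknown tag, rather than skipping it, makes it obvious when the consumer has not been updated for a constructor the exporter knows about.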

Naming isn't finalized. I usually drop repeated prefixes like St to give StWrite -> write, ExpFunctionCall -> function_call, but I wonder if retaining them would help clarity, because it helps in explaining what type the next node is.
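
To make the two naming options concrete, here is a small Python sketch of the mapping. It is not the tool's implementation; the prefix list is an assumption that contains only the two prefixes mentioned above, and `json_tag` is a hypothetical helper name.

```python
import re

# Assumed prefix list: only St and Exp appear in this discussion.
PREFIXES = ("St", "Exp")


def json_tag(constructor, keep_prefix=False):
    """Map a constructor name such as StWrite to a snake_case JSON tag."""
    name = constructor
    if not keep_prefix:
        for prefix in PREFIXES:
            rest = name[len(prefix):]
            # Strip only when the prefix is followed by a capitalised word,
            # so a name that merely starts with "St" is left alone.
            if name.startswith(prefix) and rest[:1].isupper():
                name = rest
                break
    # Put "_" before every capital except the first, then lower-case.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


for ctor in ("StWrite", "ExpFunctionCall"):
    print(ctor, "->", json_tag(ctor), "or", json_tag(ctor, keep_prefix=True))
```

With `keep_prefix=True` the tag still says which family the node belongs to (`st_write`, `exp_function_call`), which is the clarity trade-off raised above.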

We should also be able to parse correctly-structured JSON back into our AST type. If that's not useful, we might be able to make the serialization a bit better in exchange for reducing or dropping that support.
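
For readers wondering what the round trip means in practice, here is a toy Python sketch with two hypothetical node types. It only demonstrates the property (decoding an encoded tree gives back an equal tree); the types, tags, and fields are assumptions and do not mirror the real AST.

```python
import json
from dataclasses import dataclass


# Toy node types standing in for AST constructors (hypothetical).
@dataclass(frozen=True)
class Value:
    text: str


@dataclass(frozen=True)
class FunctionCall:
    name: str
    args: tuple


def encode(node):
    """Turn a node into a tag/contents dictionary."""
    if isinstance(node, Value):
        return {"tag": "value", "contents": node.text}
    if isinstance(node, FunctionCall):
        return {
            "tag": "function_call",
            "contents": {"name": node.name,
                         "args": [encode(a) for a in node.args]},
        }
    raise TypeError(f"cannot encode {type(node).__name__}")


def decode(obj):
    """Rebuild a node from a tag/contents dictionary."""
    tag, contents = obj["tag"], obj["contents"]
    if tag == "value":
        return Value(contents)
    if tag == "function_call":
        args = tuple(decode(a) for a in contents["args"])
        return FunctionCall(contents["name"], args)
    raise ValueError(f"unexpected tag: {tag}")


tree = FunctionCall("abs", (Value("x"),))
text = json.dumps(encode(tree))
print(decode(json.loads(text)) == tree)
```

Keeping this property constrains the encoding: nothing needed to rebuild a node can be dropped or merged, which is why giving it up could allow a simpler serialization.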

What do you think, @uNouss?

uNouss commented 2 years ago

Thank you for your feedback.

> I have some work ongoing at https://github.com/camfort/fortran-src-aeson/ which aims at providing a decent JSON representation of the AST. Here's what the tool exports for your attached file: big_ints_real.f.yaml.txt (used YAML so it's a bit easier to scan through, but either means both).

The repository appears to be private. I was unable to gain access.

> Sum types are handled by having a tag field, then a contents field where the structure depends on what tag was (which constructor it's pointing to). So depending on context, you need to expect different values of tag and know what they mean for the structure of contents.

It's a very good idea to add a tag. It tells you what kind of object you are manipulating. Very useful.

> Naming isn't finalized. I usually drop repeated prefixes like St to give StWrite -> write, ExpFunctionCall -> function_call, but I wonder if retaining them would help clarity, because it helps in explaining what type the next node is.

About naming, it is also useful to keep the prefixes, to make the link with AST.hs clear.

> We should also be able to parse correctly-structured JSON back into our AST type. If that's not useful, we might be able to make the serialization a bit better in exchange for reducing or dropping that support.

I can't think of any case where I would need that. But if it is possible, it is always good to have.

By the way, thanks for the work and the implementation of this feature.

raehik commented 2 years ago

> The repository appears to be private. I was unable to gain access.

Oops, thanks! It should be public now. If you're able to build it locally, you should be able to run it with stack, for example: stack run -- -v 77e encode file.f

uNouss commented 2 years ago

Yes, it is accessible now. Thanks for the quick response. I will test this.

uNouss commented 2 years ago

@raehik Hello, I tried to run it with stack run -- -v 77e encode file.f, but I get errors when it tries to clone fortran-src: camfort/fortran-src-aeson#1

raehik commented 2 years ago

Ping @uNouss, have you had any luck with the JSON serializer? (And did the fix I mentioned work?)

uNouss commented 2 years ago

@raehik Sorry for the late reply.

Yes, I was able to install it with the patch. It does let me get a YAML file like the example you provided. Thanks a lot for the work. I have not been able to test it much yet, but I will not hesitate to give you detailed feedback as soon as I can. Thanks again.