LangProc / langproc-cw

Compiler coursework repository for Instruction Architectures and Compilers module at Imperial College London

Refactor compiler skeleton #22

Closed. simon-staal closed this 3 months ago

simon-staal commented 4 months ago

Changes:

Other considered changes:

simon-staal commented 4 months ago

To address your points:

#pragma once seems like a less error-prone concept, but StackOverflow is still divided - do you have some insights from industry? The companies I worked for still used classic header guards.

From my experience in industry, #pragma once is basically universally seen as a much better solution than old-school header guards. Although some of the comments in the thread you link claim that #pragma once has "unfixable bugs", I think that in practice you're much more likely to have issues with header guards by accidentally reusing a guard name, either by mistake or because a library you include picked the same one. #pragma once is also well supported by basically all the big compilers you'd use.
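As a rough illustration of the name-reuse problem (a sketch, not code from this repository): the single file below pretends to be two unrelated headers, the hypothetical a/util.hpp and b/util.hpp, that both picked UTIL_HPP as their guard. All function and macro names here are made up for the example.

```cpp
// --- pretend this is a/util.hpp (hypothetical file) ---
#ifndef UTIL_HPP
#define UTIL_HPP
constexpr int FromA() { return 1; }
#endif  // UTIL_HPP

// --- pretend this is b/util.hpp (hypothetical file) ---
// An unrelated header that happens to reuse the same guard name.
#ifndef UTIL_HPP
#define UTIL_HPP
#define B_UTIL_BODY_SEEN 1
constexpr int FromB() { return 2; }
#endif  // UTIL_HPP

// Reports whether the body of the second "header" survived preprocessing.
constexpr bool SecondHeaderBodySeen() {
#ifdef B_UTIL_BODY_SEEN
  return true;
#else
  return false;
#endif
}

// The second body was silently skipped, so FromB() does not exist at all.
static_assert(!SecondHeaderBodySeen(), "guard collision dropped b/util.hpp");
```

With #pragma once in each file instead, the compiler tracks the file itself rather than a macro name, so two different headers cannot shadow each other this way.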

Namespace names should be lowercase according to Google Style Guide that we are following.

sadge. will update

NodePtr usage is inconsistent - there are still many places that use Node*.

Not 100% sure I follow here. I'm pretty sure I use NodePtr everywhere in the AST files. If you're talking about the AST constructors / parser, unfortunately, flex / bison are pretty tied to a pre-C++11-style API and don't work well with unique_ptr, as they try to copy it around (which is illegal). So using raw pointers there, and then letting your AST own this memory by storing them internally as unique_ptrs, is the best we can do imo.
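A minimal sketch of that hand-off, assuming NodePtr is an alias for std::unique_ptr<const Node>. The classes IntConstant and ReturnStatement, the Evaluate() method, and the MakeReturn helper are hypothetical stand-ins, not the repository's actual code. The parser side only ever sees Node*, while each AST node takes ownership of its children as soon as it is constructed.

```cpp
#include <cassert>
#include <memory>
#include <utility>

class Node {
 public:
  virtual ~Node() = default;
  virtual int Evaluate() const = 0;  // hypothetical, just to observe the tree
};

// Assumed alias; the real definition may differ.
using NodePtr = std::unique_ptr<const Node>;

class IntConstant : public Node {
 public:
  explicit IntConstant(int value) : value_(value) {}
  int Evaluate() const override { return value_; }

 private:
  int value_;
};

class ReturnStatement : public Node {
 public:
  // The child is owned from here on; it is freed when this node is destroyed.
  explicit ReturnStatement(NodePtr expression)
      : expression_(std::move(expression)) {}
  int Evaluate() const override { return expression_->Evaluate(); }

 private:
  NodePtr expression_;
};

// Stand-in for a parser action: raw pointers in, raw pointer out.
// The raw child pointer is wrapped in a NodePtr immediately.
inline Node* MakeReturn(Node* expression) {
  return new ReturnStatement(NodePtr(expression));
}

// Whoever receives the root wraps it once, and the whole tree is then owned.
inline int EvaluateAndFree(Node* root) {
  NodePtr owned(root);
  assert(owned != nullptr);
  return owned->Evaluate();
}
```

For example, EvaluateAndFree(MakeReturn(new IntConstant(5))) walks the tree and releases every node when owned goes out of scope.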

New implementation of type_specifier is a bit advanced, but I do like the changes. Hopefully we can assume that people implementing more types than int would be proficient enough to understand the used constructs.

Given that they have to compile enums and switch statements, I'm sure we can trust them to figure out what an enum class is 😉. It's just a (safer) enum wrapped in a namespace after all.
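To make the "safer enum wrapped in a namespace" point concrete, here is a small sketch. The enumerator names and the LegacyTypeSpecifier type are assumptions for illustration only.

```cpp
#include <type_traits>

// Old-style enum: enumerators leak into the enclosing scope and
// convert to int implicitly.
enum LegacyTypeSpecifier { LEGACY_INT, LEGACY_CHAR };

// Scoped enum: enumerators must be qualified and never convert implicitly.
enum class TypeSpecifier { INT, CHAR };

static_assert(std::is_convertible<LegacyTypeSpecifier, int>::value,
              "unscoped enums silently turn into integers");
static_assert(!std::is_convertible<TypeSpecifier, int>::value,
              "scoped enums need an explicit static_cast");

// The underlying value is still reachable when asked for explicitly.
constexpr int AsInt(TypeSpecifier type) { return static_cast<int>(type); }
```

So a mistake like comparing a type specifier against an unrelated integer compiles with the old enum but is rejected with enum class.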

I think type_specifier should still be split into .hpp and .cpp as currently all the implementation is in .hpp.

Given that type_specifier itself is just an enum class, and that the helper functions provided are relatively simple, I think it's fine (maybe even better) to leave it all in .hpp (the functions are constexpr, which solves any possible ODR issues). That said, if you have a particularly strong opinion, I can split it into a .cpp.
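A header-only helper along these lines might look like the sketch below. The ToString signature is the one quoted later in this thread, but the body, the single INT enumerator, and the error message are assumptions. Since constexpr functions are implicitly inline, including this header from several .cpp files does not produce duplicate-definition errors at link time.

```cpp
#include <stdexcept>
#include <string_view>

// Assumed enumerator; the real enum may list more types.
enum class TypeSpecifier { INT };

// Requires C++17 for std::string_view.
constexpr std::string_view ToString(TypeSpecifier type) {
  switch (type) {
    case TypeSpecifier::INT:
      return "int";
  }
  // Only reached for values outside the enum; never hit in constant evaluation
  // of valid inputs.
  throw std::runtime_error("Unexpected type specifier");
}

// Usable at compile time because the definition is visible in the header.
static_assert(ToString(TypeSpecifier::INT) == "int");
```

Moving the definition into a .cpp would keep the function callable at run time, but other translation units could then no longer evaluate it in constant expressions such as the static_assert above.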

emplace_back needs a mention in the docs along with move semantics in general as it's certainly an advanced topic.

I can also touch on it; I'm currently using it to ensure that NodeList builds without issues for both smart and raw pointers.
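A sketch of why emplace_back helps here, assuming NodeList stores a std::vector of NodePtr. The PushBack template and Size() are hypothetical names, not the repository's API. emplace_back forwards its argument to the unique_ptr constructor, so it accepts a raw Node* (which push_back would reject, because that constructor is explicit) as well as a moved NodePtr.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Node {
  virtual ~Node() = default;
};

// Assumed alias; the real definition may differ.
using NodePtr = std::unique_ptr<const Node>;

class NodeList {
 public:
  // Accepts either a raw Node* or an rvalue NodePtr and constructs the
  // stored NodePtr in place inside the vector.
  template <typename T>
  void PushBack(T&& item) {
    nodes_.emplace_back(std::forward<T>(item));
  }

  std::size_t Size() const { return nodes_.size(); }

 private:
  std::vector<NodePtr> nodes_;
};

inline std::size_t DemoNodeList() {
  NodeList list;
  list.PushBack(new Node);             // raw pointer, as handed over by the parser
  NodePtr owned(new Node);
  list.PushBack(std::move(owned));     // smart pointer, moved rather than copied
  assert(owned == nullptr);            // ownership has left the local variable
  return list.Size();
}
```

One caveat worth mentioning in the docs: if emplace_back throws while growing the vector, a raw pointer passed to it has not been wrapped yet and would leak, so the raw-pointer path is a convenience for the parser rather than a general recommendation.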

In terms of less intrusive changes to include, what are you thinking? I could add (in order of least to most intrusive):

Fiwo735 commented 4 months ago

Not 100% sure I follow here. I'm pretty sure I use NodePtr everywhere in the AST files. If you're talking about the AST constructors / parser, unfortunately, flex / bison are pretty tied to a pre-C++11-style API and don't work well with unique_ptr, as they try to copy it around (which is illegal). So using raw pointers there, and then letting your AST own this memory by storing them internally as unique_ptrs, is the best we can do imo.

Oh, so that means that all of these have to stay as Node*? [image]

If so, then I think this is going to be too confusing for students: [image]

Is there any way around it? Like some conditional NodePtr hackery for flex/bison or extracting raw pointers from unique_ptr etc.?

Btw, I've noticed some places still have the previous type style (same goes for files in compiler_tests/ and debugging/, but I don't think it's that important to be consistent there): [image]

That said, if you have a particularly strong opinion, I can split it into a .cpp.

I think it'd be more consistent with existing files to split constexpr std::string_view ToString(TypeSpecifier type)

In terms of less intrusive changes to include, what are you thinking?

I'd say fixing memory leaks is definitely the most important change to merge into main. The other changes are less critical, and they affect .cpp and .hpp files, which I believe we should avoid modifying at this point to avoid confusion (while obviously pushing them to main_2024 and merging after the submissions, as I do agree they're very valuable additions!).

simon-staal commented 4 months ago

Update: