hsutter / cppfront

A personal experimental C++ Syntax 2 -> Syntax 1 compiler

[SUGGESTION] Simplifying language (Everyday problems) #1085

Open Parad0x84 opened 1 month ago

Parad0x84 commented 1 month ago

Disclaimer (and answer to the questions in the suggestion template): First of all, what I'm about to suggest is not necessarily about eliminating security vulnerabilities or guidance literature (though it might be). It's more about addressing some problems you deal with frequently and simplifying how you deal with them. Also, I present this idea only as an opportunity to explore possibilities (so I did not consider alternatives, since the suggested feature is pretty rare; at least I'm not aware of more examples), and I think something like this is quite possible within cppfront.

There are a few very common problems you deal with frequently, namely: 1) custom memory allocators, 2) logging, 3) dealing with libraries (this one is more of a side product of the first two).

If we can somehow simplify these problems all at once, it would be great in my opinion. So how might we do that? We could centralize how you do these 2 operations (possibly even more). Having briefly explained what I have in mind, I think it is easier to point to another programming language (not fully released...): the Jai programming language (by Jonathan Blow). Jai has a neat feature called "Context", and I think we might be able to apply something similar to cppfront, or at least consider some alternatives. Without making this post any longer, here are some resources about what this feature is and what Jai is:
https://medium.com/@christoffer_99666/a-little-context-d06dfdec79a3
https://github.com/Ivo-Balbaert/The_Way_to_Jai/blob/main/book/25_Context.md
https://www.forrestthewoods.com/blog/learning-jai-via-advent-of-code/#context

hsutter commented 1 month ago

Thanks for the questions!

Re (1), a partial answer is that cppfront does all memory allocation using named arena objects (aka allocators). If you mean the standard allocators, I don't have any ideas on making them work better, but cppfront fully supports them as they are today.

Re (2), there are quite a few good C++ logging libraries (including available via Conan and vcpkg package managers), and they should all work perfectly with cppfront. I hear very good things about spdlog. Is there a reason these aren't suitable for what you want to do?

Re (3), which I agree is related, are you looking for something like a package manager? Conan and vcpkg are quite good, and work fine with all C++ code including C++ code written using Cpp2/cppfront.

Parad0x84 commented 1 month ago

Hi, sorry for the ambiguity. I thought the provided links would be enough to describe what I'm talking about. The idea is that you don't pass an allocator to everything and expect it to be used. You just create a global-ish variable that contains your allocator (you can also change it later on, or just use another one for a small scope). Everything automatically uses your allocator instead of asking the OS. Same idea for logging: you provide some formatting in the same global-ish variable, and everything that logs something uses your message formatting (for example, a library could have a "[Library A]: " prefix before any message to the console). About libraries: if you can do these two, you automatically have better control over what a library does.
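To make the idea above concrete, here is a minimal sketch in today's C++ (C++17), not a cppfront feature and not how Jai implements it. All names (`context`, `current_context`, `scoped_context`, `counting_resource`, `library_work`) are hypothetical and exist only for illustration; the only standard pieces used are `std::pmr::memory_resource` and `std::pmr::vector`. It assumes cooperating code that reads the ambient context instead of calling `new` directly.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <string>
#include <utility>
#include <vector>

// Hypothetical "context": one ambient allocator plus one log prefix.
struct context {
    std::pmr::memory_resource* allocator = std::pmr::new_delete_resource();
    std::string log_prefix;
};

// The "global-ish" variable; thread_local so each thread has its own copy.
inline thread_local context current_context{};

// RAII guard: swap in a different context for a small scope, then restore.
class scoped_context {
    context saved_;
public:
    explicit scoped_context(context replacement)
        : saved_(std::exchange(current_context, std::move(replacement))) {}
    ~scoped_context() { current_context = std::move(saved_); }
    scoped_context(const scoped_context&) = delete;
    scoped_context& operator=(const scoped_context&) = delete;
};

// A memory resource that counts allocations so the effect is observable.
class counting_resource : public std::pmr::memory_resource {
    void* do_allocate(std::size_t bytes, std::size_t align) override {
        ++allocations;
        return std::pmr::new_delete_resource()->allocate(bytes, align);
    }
    void do_deallocate(void* p, std::size_t bytes, std::size_t align) override {
        std::pmr::new_delete_resource()->deallocate(p, bytes, align);
    }
    bool do_is_equal(const std::pmr::memory_resource& other) const noexcept override {
        return this == &other;
    }
public:
    std::size_t allocations = 0;
};

// "Library" code: takes no allocator or logger parameter, reads the context.
std::string library_work() {
    std::pmr::vector<int> data(current_context.allocator);
    data.assign({1, 2, 3});
    return current_context.log_prefix + "processed " +
           std::to_string(data.size()) + " items";
}
```

A caller would write something like `scoped_context guard(context{&my_resource, "[Library A]: "});` before calling `library_work()`. The limitation is the one discussed later in the thread: only code written against the context picks it up, so existing code that calls `new` or `make_unique` is unaffected.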

I know it sounds like a lot of work, and I'm not claiming either that it is or that it isn't worth it. I'm just saying "I believe we can achieve something like that". So the question is "Is that worth doing?". Even if it's not, it's just an idea. I'm just presenting it to see whether something like that would be useful to this project, since the goals seem to match.

DyXel commented 1 month ago

Interesting, the Jai programming language seems to be tackling the complexity problem, but from a different angle. Good to see lots of people experimenting in this area! Is there a way we could access a compiler for it to test things out? I didn't see any public release anywhere...

Onto the questions themselves:

For the 3rd, I think we all wish to simplify the language and its surroundings while keeping it expressive, but dealing with packaging or libraries doesn't currently seem to be on the main road. As Herb explained, there are plenty of good tools out there that already do the job, even if they are not as frictionless as what a complete "suite" for the language would provide. Having a smaller, simpler language tackles the issue indirectly, though; perhaps we could see a more integrated "suite" (smaller/faster compiler, packaging, IDE, etc.) if cppfront truly takes off.

For the 1st and 2nd, currently this context feature as described in the linked posts is not really implemented, same with logging, but I think I get it: Having a common piece of code that doesn't know about allocation or logging but that can still be customized without having to change all the code (and thus, reduce friction from refactoring). I believe metafunctions could be tailored to help with these cases, you could have one that changes the arena objects or one that changes the logging functions called.

Parad0x84 commented 1 month ago

Jai is in closed beta, I think. So there is no compiler you can work with, but there are content creators who have access to it. So if you are interested, you can find some written posts and videos.

I want to clarify something: I don't want to be rude, but I don't know where you got the idea that I'm asking 3 different questions (it's probably my mistake). I'm just presenting an idea and 2 problems it can help with (the 3rd one is a side product). Again, I'm sorry if this came off as rude. I'm just trying to clear up a misunderstanding.

DyXel commented 1 month ago

No worries. Now that I look closely again, yeah, you just pointed at several pain points; they weren't specifically questions. It's just that it's hard to come up with specific solutions to a term as broad as "everyday problems", which is why I specifically addressed what you posted, since I couldn't possibly answer super broadly. Friction and difficulties are different for each of us, and from what I understood, Jai is trying to deal with problems that game engine developers face every day. The solutions and designs proposed by cppfront might well not align 100% with these, but we can try to do our best.

Parad0x84 commented 1 month ago

Yeah, I don't know if I could come up with a better title... I get what you are saying. Jai targets a specific group to solve their problems, but I think it should mostly align with general C++ development. Not 100% of course, but mostly. As for the specific subject, the "Context" feature, it's not specific to game development.

And again, it's not a problem I'm personally having and trying to find a solution for. It's just a neat feature I've seen somebody else do, and I think it's applicable to cppfront as well (the projects have similar goals). If not, that's fine. I'm just presenting it to see whether it's something we might need. Thanks :)

Note: I'd also like to see what other people think, not just maintainers

hsutter commented 1 month ago

The idea is that you don't pass an allocator to everything and expect it to be used. You just create a global-ish variable that contains your allocator (you can also change it later on, or just use another one for a small scope). Everything automatically uses your allocator instead of asking the OS. Same idea for logging: you provide some formatting in the same global-ish variable, and everything that logs something uses your message formatting (for example, a library could have a "[Library A]: " prefix before any message to the console).

Ah, thanks, now I understand better! This is the approach for contracts in Cpp2, everyone uses the same contracts facility and can do things like share a violation handler. I'll keep allocators and logging in mind.

For logging it seems it would be easier: We could provide a common logging facility that everyone can be encouraged to use, as with contracts. Likely it could be an ordinary library in cpp2util.h.

For allocators, I think someone needs to sketch a proposal for use cases and how they could be made to work, and how compatible that design is with current C++ code. If anyone has an implementation idea in mind, please share! That said, here's a sketch of one idea, which has an open question/tradeoff or two:

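  • The default allocation in Cpp2 is already wrapped in unique.new and shared.new, and unique_ptr and shared_ptr can carry deleters. So in principle we already have a point (those wrappers) where we could inject support for an ambient allocator, whereby calls to unique.new and shared.new use the ambient allocator and do the right thing via a deleter. However, the glitch in that plan is that for unique_ptr the deleter is part of the type... so doing this would change the return type of unique.new to something different from unique_ptr. One way to keep the type stable is for unique.new to always return a unique_ptr<T, cpp2::deleter> that does erasure, but besides any overhead with that, I worry that it too would not be the same type as unique_ptr and we would lose that type compatibility. Also, it would only help allocations made via unique.new and shared.new... any allocations made in existing code using make_unique and make_shared wouldn't get ambient allocators, only Cpp2 code that uses unique.new and shared.new would. So those are some open questions/tradeoffs with using the standard smart pointers. And I think using the standard smart pointers is desirable, if we can avoid telling users to use some other-and-different smart pointer system.

/brain-dump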

Parad0x84 commented 1 month ago

I was thinking of something like detecting calls to "new" (and memory allocation by library types like shared_ptr, etc.) when we run code through the compiler, and doing the necessary operations there, including for libraries (obviously, for libraries we have the source code for). After seeing your answer, I understand this is not very realistic. But if we can apply something like that to new code that cooperates with it, it would still be a great benefit in my opinion. We are still very early in cppfront, so I think we can afford this kind of change for the future. So maybe we could make it opt-in/opt-out of some kind (if we decide to progress with this)?

For allocators, I think someone needs to sketch a proposal for use cases and how they could be made to work, and how compatible that design is with current C++ code. If anyone has an implementation idea in mind, please share!

By the way, can you clarify what you mean by "proposal"? Does cppfront have a proposal system of some kind, or are we talking about standard C++ proposals?

rsashka commented 1 month ago

Sorry to barge into the conversation, but I have an idea for allocators that is completely backwards compatible with current C++ implementations, but provides full reference control without runtime overhead, i.e., the entire memory management mechanism works at compile time.

The idea is to create a separate syntax for defining and working with "managed" variables, so that proper memory handling is built into the lexical rules for handling such variables, and their internal implementation is based on existing elements (shared_ptr, weak_ptr and unique_ptr).

If it is possible to create syntax rules for working with controlled variables, for example along the lines of the post "Possible solution to the problem of references in programming languages" on r/ProgrammingLanguages (although that article does not take multi-threaded references into account), then such a solution could be used not only in a specialized language such as cppfront or NewLang, but might also be adopted in a future C++ standard, C++2c or C++3c.

hsutter commented 1 month ago

For allocators, I think someone needs to sketch a proposal for use cases and how they could be made to work, and how compatible that design is with current C++ code. If anyone has an implementation idea in mind, please share!

By the way, can you clarify proposal? Does cppfront have a proposal system of some kind

Sorry, I was unclear! I meant a proposal here -- could be just a detailed comment in this thread, or a new Suggestion thread with a detailed design that could be looked at.

hsutter commented 1 month ago

The idea is to create a separate syntax for defining and working with "managed" variables, so that proper memory handling is built into the lexical rules for handling such variables, and their internal implementation is based on existing elements (shared_ptr, weak_ptr and unique_ptr).

Is this similar to Rust-style borrow checking? Or perhaps to Verona which was pursuing essentially bounded memory regions with a single point of access to the objects in a region?

rsashka commented 1 month ago

The stated goal is very similar to Verona's, but I don't know Verona's specifics, so most likely no.

It is similar to Rust's borrowing only at the level of the idea, but it is not the same. In Rust, control is tied only to the owner of the object, whereas this idea is full semantic control of references, including indicating whether they can be used from different application threads (with multi-threaded access synchronization built into the compiler).

hsutter commented 1 month ago

At that level of detail it sounds a lot like Verona. The basic idea in Verona is that each isolation region was a group of objects that you had to access via a single "root" entry point, and that was the chokepoint that let you give language (they had their own language extensions) guarantees on single-mutable-use and rendezvous and other synchronization-related effects like those. (It was incredibly detailed to show how/that it all worked, what primitives and strategies were involved, that's just a very handwavy summary from memory.)

rsashka commented 1 month ago

Yes, now this sounds very similar to Verona: a variable that becomes the single entry point for accessing a region (or a group of objects) can be specified semantically as an access synchronization point.

P.S. Thank you very much for the link to the Verona project!

Parad0x84 commented 1 month ago

@rsashka I'm not sure what exactly you are suggesting, but as far as I understand from the following explanation, it's something like a wrapper object around a pointer, just like unique_ptr, but with some special behaviour.

At that level of detail it sounds a lot like Verona. The basic idea in Verona is that each isolation region was a group of objects that you had to access via a single "root" entry point, and that was the chokepoint that let you give language (they had their own language extensions) guarantees on single-mutable-use and rendezvous and other synchronization-related effects like those. (It was incredibly detailed to show how/that it all worked, what primitives and strategies were involved, that's just a very handwavy summary from memory.)

Also, I'm not sure where these "managed" variables would fit when we are also preparing a garbage-collected opt-in (as far as I'm aware). I also don't know what this has to do with this post.

rsashka commented 1 month ago

Thank you! Then I'll make a separate proposal with a full description, so as not to interfere with the discussion of this post.

Parad0x84 commented 1 month ago

I don't know if that was rude. If it was, I'm sorry. It just sounded unrelated. Regardless, I would like to hear more about it. It sounds interesting, but I don't think I understand it. So if you could drop a link when you create another post, I would appreciate it.

Parad0x84 commented 1 month ago

here's a sketch of one idea, which has an open question/tradeoff or two:

  • The default allocation in Cpp2 is already wrapped in unique.new and shared.new, and unique_ptr and shared_ptr can carry deleters. So in principle we already have a point (those wrappers) where we could inject support for an ambient allocator, whereby calls to unique.new and shared.new use the ambient allocator and do the right thing via a deleter. However, the glitch in that plan is that for unique_ptr the deleter is part of the type... so doing this would change the return type of unique.new to something different from unique_ptr. One way to keep the type stable is for unique.new to always return a unique_ptr<T, cpp2::deleter> that does erasure, but besides any overhead with that, I worry that it too would not be the same type as unique_ptr and we would lose that type compatibility. Also, it would only help allocations made via unique.new and shared.new... any allocations made in existing code using make_unique and make_shared wouldn't get ambient allocators, only Cpp2 code that uses unique.new and shared.new would. So those are some open questions/tradeoffs with using the standard smart pointers. And I think using the standard smart pointers is desirable, if we can avoid telling users to use some other-and-different smart pointer system.

Can we talk about why we want to stay consistent with unique_ptr<T>, or even have consistency at all? Why can't we have unique_ptr<T, X> where X doesn't use type erasure? Is it because we want to interface over something like unique_ptr<T>& ?
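The type issue in the quoted sketch can be shown in standard C++17. This is only an illustration under assumptions, not cppfront's implementation: `resource_deleter`, `ambient_unique_ptr`, `make_ambient_unique`, `make_ambient_shared` and `tracking_resource` are hypothetical names, and `std::pmr::memory_resource` stands in for "an ambient allocator". The deleter here is erased in the sense that it only holds a `memory_resource*`, so one pointer type works with any resource.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <memory_resource>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical deleter: remembers which resource the object came from.
template <class T>
struct resource_deleter {
    std::pmr::memory_resource* resource = nullptr;
    void operator()(T* p) const noexcept {
        p->~T();
        resource->deallocate(p, sizeof(T), alignof(T));
    }
};

template <class T>
using ambient_unique_ptr = std::unique_ptr<T, resource_deleter<T>>;

// For unique_ptr the deleter is part of the type, so this is a different type.
static_assert(!std::is_same_v<ambient_unique_ptr<int>, std::unique_ptr<int>>);

template <class T, class... Args>
ambient_unique_ptr<T> make_ambient_unique(std::pmr::memory_resource* r, Args&&... args) {
    void* raw = r->allocate(sizeof(T), alignof(T));
    try {
        T* obj = ::new (raw) T(std::forward<Args>(args)...);
        return ambient_unique_ptr<T>(obj, resource_deleter<T>{r});
    } catch (...) {
        r->deallocate(raw, sizeof(T), alignof(T));  // constructor threw
        throw;
    }
}

// shared_ptr erases the allocator itself, so the result is a plain shared_ptr<T>.
template <class T, class... Args>
std::shared_ptr<T> make_ambient_shared(std::pmr::memory_resource* r, Args&&... args) {
    return std::allocate_shared<T>(std::pmr::polymorphic_allocator<T>(r),
                                   std::forward<Args>(args)...);
}

// Test helper: counts blocks that are currently allocated.
class tracking_resource : public std::pmr::memory_resource {
    void* do_allocate(std::size_t bytes, std::size_t align) override {
        ++live;
        return std::pmr::new_delete_resource()->allocate(bytes, align);
    }
    void do_deallocate(void* p, std::size_t bytes, std::size_t align) override {
        --live;
        std::pmr::new_delete_resource()->deallocate(p, bytes, align);
    }
    bool do_is_equal(const std::pmr::memory_resource& other) const noexcept override {
        return this == &other;
    }
public:
    int live = 0;
};
```

With a deleter X that is not erased (for example, one per concrete allocator type), every allocator would produce yet another pointer type, so a function taking `std::unique_ptr<T>&` would accept none of them. The erased version gives one stable type, but it still is not `std::unique_ptr<T>` and it carries an extra pointer, which is the tradeoff described in the quote.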

Xeverous commented 1 week ago

Here is my take on these issues.

For context, I'm a passionate C++ programmer who, through my headhunting company ("programmer as a service" business model), works with many kinds of C++ projects, often coming from 3rd (or 4th) party partners. In simpler words, I work with C+- code bases where "guidelines" like "don't use std::chrono, uint64_t is much simpler and shorter" and "#define is as good as constexpr" are the norm. Knowledge and competence are very scarce in this area, and if the language doesn't force people to do something beneficial, they will never use it, just like my current C++17 project had exactly 0 [[nodiscard]] when I joined. For these reasons I love every idea (like required discard) that makes bad code impossible or at least uncomfortable to write.

1) Custom memory allocator

This could not exist at all and (at least where I work) no one would notice. Some people prefer to use and deep-copy std::list because it has a nice .sort() available. Jokes aside, I think this might be an underused standard library feature, and the status quo with strange rebinding and allocator-is-only-for-T isn't helping either. If Cpp2 wants to offer some specific customization point here, it would be really useful to have some built-in allocators to support people who know enough to be aware that custom allocation is a thing but aren't savvy enough to write their own implementation.
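For reference, standard C++17 already ships a few ready-made allocators in `<memory_resource>`, which may be a starting point for the "batteries" wished for above. The sketch below is only an illustration: `sum_first` is a hypothetical function, and the 1024-byte buffer size is an arbitrary assumption. It puts a `std::pmr::vector` on a stack buffer and uses `std::pmr::null_memory_resource()` as the upstream so that any attempt to fall back to the heap throws.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <new>
#include <numeric>
#include <vector>

// Sums 1..n using a vector whose storage lives entirely in a stack buffer.
int sum_first(int n) {
    std::array<std::byte, 1024> buffer;  // arbitrary size chosen for the sketch
    std::pmr::monotonic_buffer_resource arena(
        buffer.data(), buffer.size(),
        std::pmr::null_memory_resource());  // no heap fallback: throws instead

    std::pmr::vector<int> values(&arena);
    values.reserve(static_cast<std::size_t>(n));  // throws std::bad_alloc if too big
    for (int i = 1; i <= n; ++i) {
        values.push_back(i);
    }
    return std::accumulate(values.begin(), values.end(), 0);
}
```

Calling `sum_first(10)` fits in the buffer; a request that does not fit, such as `sum_first(1000)`, throws `std::bad_alloc` rather than silently going to the heap.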

In other words, C++ is and still can be a very powerful language but it lacks the "batteries included" part a lot.

Zig has an interesting allocation customization but I think having it specified for every function is too verbose.

2) Logging

I see 2 possibilities here:

In practice, almost every project I worked on had the printf-macro approach, because in the majority of cases people just want to print built-in types coupled with __FILE__, __LINE__ and __func__. I think the best possible "everyday problem simplification" here would be to be able to use fmt for logging but with a custom sink: that is, you decide where the generated string goes, without the need to write additional templates or other complex code.
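A rough sketch of the "formatting plus custom sink" shape described above, in plain C++17. To stay dependency-free it uses stream insertion as a stand-in for fmt, so it is not the fmt API; `log_sink`, `current_sink`, `log_at` and the `LOG` macro are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// The sink decides where each finished log line goes.
using log_sink = std::function<void(const std::string&)>;

// One process-wide sink; the default discards every message.
inline log_sink& current_sink() {
    static log_sink sink = [](const std::string&) {};
    return sink;
}

// Builds "file:line (func) <args...>" and hands the string to the sink.
template <class... Args>
void log_at(const char* file, int line, const char* func, const Args&... args) {
    std::ostringstream out;
    out << file << ':' << line << " (" << func << ") ";
    (out << ... << args);  // fold expression: stream every argument in order
    current_sink()(out.str());
}

// Captures the call site automatically; needs at least one argument.
#define LOG(...) log_at(__FILE__, __LINE__, __func__, __VA_ARGS__)
```

Usage would look like `current_sink() = [](const std::string& s) { std::fputs(s.c_str(), stderr); };` (with `<cstdio>` included) followed by `LOG("value=", 42);`. Swapping the sink is the only step needed to redirect output to a file, a ring buffer, or a test fixture.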

3) Dealing with libraries

Where I work I often need to integrate libraries with different build systems and more often than not it ends up as some abomination of mixed CMake, Bash, Python and build-time code generation. Often with dysfunctional incremental builds.

The language will likely never support every use case, but I find it strange that there is no official format for a "C/C++ project tree and build info". The closest thing we have is CMake's target sources/includes/defines/links commands, but that still allows a ton of misuse. The C/C++ build system is so complex that I frequently encounter "try random stuff until it builds" approaches, because people don't find it worth their time to learn all the theory and tools.

Again, I think the best C++ can do is to include more batteries. People still get a ton of "undefined reference" errors and need to visit Stack Overflow to learn about them because the compiler won't or can't give an explanation. Some ODR violations still result in "real" undefined behavior, that is, seemingly functioning builds but with random crashes.


Sorry for the somewhat rant-style comment, but these are my everyday problems. "C/C++" or "C+-" is very common at my workplace, and even in 2024 I see people professionally writing new code with #define CONSTANT (123u). Code review makes little sense when pointing out such mistakes gets ignored. I just wish C++ were much less permissive of unidiomatic code.