blade-lang / blade

A modern general-purpose programming language focused on enterprise Web, IoT, and secure application development.
https://bladelang.com
Other
169 stars 11 forks source link

fix for segmentation fault #159

Closed apfeltee closed 1 year ago

apfeltee commented 1 year ago

The segmentation fault in free_objects() occurs due to b_obj_string fields being errorneously freed twice, which results in those objects becoming invalid.

The fields in question are name in b_obj_closure, name in b_obj_func, name in b_obj_class and mode & path in b_obj_file.

Additionally, leaks occur whenever copy_string() is called with an empty string, as well as unnecessary b_obj_string creation in io_module_(stdin|stdout|stderr) for the field mode (which is already created during call to new_file).

After fixing that, there should be no longer a segmentation fault, although there are approximately 10 blocks that I suspect are created during calls to strdup.

You can find most of them with the regex '(->|.)\b(name|mode|path)\b', i.e. with grep.

mcfriend99 commented 1 year ago

Thanks for the report @apfeltee. I'll look into this now and confirm them. Did you happen to fix it in your build?? If you already fixed them, a PR will really be appreciated.

apfeltee commented 1 year ago

i did fix it in my build, but my programming style is too different from yours to make a PR viable.
Just to confirm, I did a local diff, and it weighs in at ~7000 lines difference. Mostly because I've already used clang-tidy and clang-format.

I understand this makes it more difficult, and I apologize. But the changes are really trivial; you just need to remove the offending lines in free_object() respectively.

mcfriend99 commented 1 year ago

Additionally, leaks occur whenever copy_string() is called with an empty string

Can you kindly point me to such instance and how you changed it to fix it??

After fixing that, there should be no longer a segmentation fault, although there are approximately 10 blocks that I suspect are created during calls to strdup.

Care to expand better??

apfeltee commented 1 year ago

In the case of io_module_*, I merely added the actual (virtual) file modes, so "rb" for stdin, and "wb" for stdout/stderr.

Care to expand better??

Wish I could provide better information, but your best bet is to use valgrind: valgrind --leak-check=full ./bin .... It's how I track down memory issues.

apfeltee commented 1 year ago

I do think I've figured out how to get rid of most of the remaining blocks (the exact number is dependent on the number of modules are are being declared and loaded):

In the function load_module, before CLEAR_GC, simply free the string created by strdup for new_module.

I'm actually surprised how trivial that one is, because was worried that freeing it might render the object invalid, but that does not appear to be the case. My spontaneus guess is that the string (the one for new_module) is copied to a b_obj_string at some point.

But valgrind shows all clear, which is great! Hope any of this helps.

mcfriend99 commented 1 year ago

I'm appalled that I missed that. You're absolutely right. Because add_native_module will create a copy of the name, the strdup was completely unnecessary. Thanks @apfeltee

apfeltee commented 1 year ago

so, my changes are very extensive.
I fully realize that this effectively makes my fork incompatible with what blade is (I assume) about.
Nevertheless, you might be interested.

so far I've made these changes:

link to the repo: https://github.com/apfeltee/blade-temp

take a look, maybe incorporate ideas and/or stuff you like. i don't make any money off of this, i just do this as a hobby.

[1]: using actual classes, something that Lox had provided, means that things like type checking, monkeypatching, and some other things, are possible. using only a hashtable is really inefficient.

benstigsen commented 1 year ago

Some changes that I like:

Internal primitive types (array (formerly list), string, bytes, etc) are now proper classes, that derived from Object (named "Object"). This also declares "String", "Array", et cetera.[1]

rewrote bits of the VM that deal with mathematic operations to not use macros, since assumptions about, for example shift, cannot be made that broadly

fixed a lot of warnings, and memory inconsistencies

replaced the gung-ho hash function with xxhash, which performs much, much better

The others about syntax are subjective. I do like how your changes are more common with other languages, but I do not think this is needed for Blade.

mcfriend99 commented 1 year ago

I'm really interested in the lot of warnings, and memory inconsistencies. Would you be nice to shed more light??

mcfriend99 commented 1 year ago

I've tried xxHash in the past and the couldn't see any performance improvement so it didn't feel like it's worth replacing the simple FNV1a hash for a complicated hash like that. I've even tried Siphash and Tomhash sometimes in the past, but performance gains were next to noise.

What performance gains did you notice?? Can you point to a testable sample??

apfeltee commented 1 year ago

Would you be nice to shed more light??

Have you tried compiling with -Wall -Wextra -Wshadow -Wunused-macros -Wunused-local-typedefs?

What performance gains did you notice?

Specifically, xxhash has consistent performance, both for small and large strings. That is something that will start to matter as soon as you're dealing with lots of strings, both big and small.

But frankly, it's really the most minor change in all this. I'm currently focused on making objects actual objects, which would come with a myriad of nice reflection abilities (similar to what JS does, but also inspired by Ruby & Python).

apfeltee commented 1 year ago

The others about syntax are subjective. I do like how your changes are more common with other languages, but I do not think this is needed for Blade.

Well, the way I see it, Javascript is pretty much everywhere, so a basic compatibility, language-wise, is sensible. That way there is less to learn.

Familiarity can really help boost the language, and would make it more comfortable to write in.
But I'm not gunning for a full on Javascript compatibility, but only familiarity.

benstigsen commented 1 year ago

I think the way it is now is fine. It's pretty similar to Python, but with curly braces.