Closed AuzFox closed 6 years ago
Hi, and thank you for submitting your issue.
I've tried to manage your problems with commit 3a338e5, can you please check it our for your use-case? UNICC_OUTOFMEM
is now declared as a macro that gets pcb
as a parameter in all cases. I corrected all occurences of UNICC_OUTOFMEM where pcb
was not present. Furthermore, you can stop the parser by setting pcb->act = UNICC_ERROR;
in case you want to stop parser, likewise YYERROR should do it. Please check out if this is working for you.
For your third point, would it be helpful to hold a variable in the pcb
that gets TRUE in case the pcb was allocated by @@prefix_parse()
?
Again, I hope I don't come off as rude or demanding! If there is a solution to these problems already, I apologize!
Its never rude to help an open source project become better, so thanks again for your request! Any help is always welcome.
Its never rude to help an open source project become better, so thanks again for your request! Any help is always welcome.
That's good! I guess I just get nervous talking to people, especially online, so I tend to apologize a lot, haha.
At any rate, I've downloaded the updated version and everything compiles and runs successfully. A simple test conditionally setting pcb -> act = UNICC_ERROR;
correctly stopped the parser, so that seems to work great! One question I have is: will this work in lexer terminal actions as well? I don't need this in my project, I'm just wondering.
I also did a simple test on UNICC_OUTOFMEM
by manually changing the UniCC-generated .c file to always fail a memory allocation; the call to my cleanup()
function worked as intended. This is what my cleanup()
function looks like (I changed the name too):
void @@prefix_cleanup(@@prefix_pcb* pcb)
{
// @@prefix_parse() calls UNICC_OUTOFMEM
// if it fails to malloc() the pcb structure,
// in this case, pcb will be NULL
if (!pcb)
{
return;
}
// clean up pcb internals
free(pcb -> buf);
free(pcb -> stack);
#if UNICC_UTF8
free(pcb -> lexem);
#endif
#if UNICC_SYNTAXTREE
@@prefix_syntree_free(pcb -> syntax_tree);
#endif
@@prefix_ast_free(pcb -> ast);
// clean up custom pcb contents
fclose(pcb -> infile);
// clean up pcb struct itself
// (if allocated with malloc(), @@prefix_parse() passed NULL)
//if (pcb -> allocated) // insert whatever actual name is given to this variable
//{
// free(pcb);
//}
printf("called cleanup()!\n");
}
The final if statement is obviously commented out because it won't work yet. Speaking of which, I think your idea of giving pcb
a variable to check if allocated by @@prefix_parse()
would be a good idea. It would stop pcb
from being leaked if allocated by @@prefix_parse()
. My project never calls @@prefix_parse(NULL);
, but it makes sense to provide this for anyone who might need it.
Thank you for adding this and helping me so quickly!
In my project, I plan on adding an import
statement to my grammar that would call @@prefix_parse()
recursively on other files. I want to check for errors and avoid leaking any memory, so UNICC_OUTOFMEM
must not call exit()
in this case. I took a look at the template files and I believe I found some problems that might occur, as well as some simple memory management errors:
in @@prefix_syntree_append()
:
node
is leaked if:
token
!= NULL
and strdup()
fails.
token
!= NULL
and wcsdup()
fails.
if allocation of node
fails and UNICC_OUTOFMEM
does not call exit()
:
memset()
on node
will cause a SEGFAULT
and probably crash the parser.
in @@prefix_alloc_stack()
:
pcb -> stack
is leaked if:
realloc()
on the stack fails. (pcb -> stack
is overwritten with NULL
and the old pointer is lost without freeing.)
in @@prefix_get_input()
:
pcb -> buf
is leaked if:
realloc()
on buf
fails. (pcb -> buf
is overwritten with NULL
and the old pointer is lost without freeing.)
The leaks in @@prefix_alloc_stack()
and @@prefix_get_input()
can be fixed by using a temporary variable. Something like this:
// example variable declarations:
void* main_pointer = &some_memory;
void* temp_pointer;
...
// when re-allocating:
temp_pointer = realloc(main_pointer, new_size);
if (!temp_pointer)
{
// main_pointer needs to be freed, it is not freed by realloc()
free(main_pointer);
// handle rest of error here (return error code, etc)
}
// realloc() was successful:
main_pointer = temp_pointer;
If you don't save the main pointer before the call to realloc()
, you won't be able to free it if realloc()
fails. You probably already knew this, but if not, it's good to know!
There might be other leaks and errors, but these are the ones I found.
Additionally, I don't think all instances of memory allocation are checked for errors and handled appropriately in functions like @@prefix_parse()
. This makes sense when UNICC_OUTOFMEM
calls exit()
, but might cause problems in the parser if exit()
isn't called and memory allocation fails. I haven't completely studied the whole parser algorithm so I might be wrong about this though.
Thanks again!
Hey there, and thanks for your further suggestions. I've integrated them all into the latest commit 052df85, you may check it out. All improvements where integrated both into the C and C++ targets.
The UNICC_SYNTAXTREE feature has been completely removed now from the C target (it was already deprecated and can easily be replaced by the AST construction feature).
One question I have is: will this work in lexer terminal actions as well? I don't need this in my project, I'm just wondering.
It should do so, but I haven't tested it yet.
If you don't save the main pointer before the call to realloc(), you won't be able to free it if realloc() fails. You probably already knew this, but if not, it's good to know!
Yep that was known to me, but I developed this parser many years ago and didn't focus at this time on the case what happens when the heap runs out of memory. But its good to integrate your recommendations, better now then never.
So long, Jan
The final if statement is obviously commented out because it won't work yet. Speaking of which, I think your idea of giving pcb a variable to check if allocated by @@prefix_parse() would be a good idea.
You can check via pcb->is_internal
now whether the pcb was allocated by @@prefix_parse() or from outside. I've recently fixed this with 67cad67 to always be in a valid state.
Yep that was known to me, but I developed this parser many years ago and didn't focus at this time on the case what happens when the heap runs out of memory. But its good to integrate your recommendations, better now then never.
I see. That makes sense. I know I've said it a few times already, but thanks for taking the time to implement these things. \:3
I hope I'm not being annoying with all these fix requests! I know you're probably busy with other things besides working on UniCC. I found 2 more problems worth pointing out. Besides these though, there doesn't seem to be any other memory-related issues besides these, so I'll leave you alone after this. \:P
@@prefix_ast_create()
doesn't check if the token
passed to it from @@prefix_lexem()
is NULL
because of an allocation failure.
@@prefix_parse()
doesn't check calls to @@prefix_ast_create()
for errors.
Hey @AuzFox,
@@prefix_ast_create()
doesn't check if thetoken
passed to it from@@prefix_lexem()
isNULL
because of an allocation failure.
That's not correct: @@prefix_ast_create() can also be called with NULL as token parameter and handles this correctly, so no token will be copied into the AST node.
@@prefix_parse()
doesn't check calls to@@prefix_ast_create()
for errors.
That's right, and now hopefully fixed with 453a853 :+1:
I hope I'm not being annoying with all these fix requests! I know you're probably busy with other things besides working on UniCC. I found 2 more problems worth pointing out. Besides these though, there doesn't seem to be any other memory-related issues besides these, so I'll leave you alone after this. :P
I'm very thankful for any help making UniCC better and more stable. If you like and use it, its an honor for me. Sure I'm also busy with other things, but UniCC is a project I totally do for fun, and it's a lot of fun when people like you are using and improving it. So please don't leave me alone, and help to make it better :+1:
So long, Jan
Hi @AuzFox, can I close this issue for now?
Sorry, I have been busy with things and forgot to check back here. Yes, this issue can be closed.
(Sorry for the long message! I wanted to explain everything thoroughly, so the text just kept growing!)
Hello! I've been using UniCC to write a compiler for a programming language after getting fed up with flex/bison's clunky interface. I must say, UniCC is much easier to work with! However, I believe there is a problem with the way UniCC manages its memory, specifically when memory allocation fails.
The default definition of
UNICC_OUTOFMEM
simply prints an error message and callsexit()
, if I am correct, this leaks all memory allocated for thepcb
struct. (potentially including the struct itself if@@prefix_parse()
is passedNULL
) Additionally, callingexit()
immediately prevents me from cleaning up any memory allocated or files opened outside of the@@prefix_parse()
call.Since
UNICC_OUTOFMEM
can be redefined, I tried putting pointers to all the data I needed to free in thepcb
struct and redefinedUNICC_OUTOFMEM
to passpcb
to a function to that frees everything before callingexit()
. This caused problems because:UNICC_OUTOFMEM
is used in lots of different places, andpcb
usually expands correctly. But in a few places (like the ast-related functions) there is nopcb
reference, and the macro expansion causes compile errors.typically, the
#prologue
code is inserted before thepcb
struct definition. In order to try and set up everything "properly" I wrote something like this in my .par file:epilogue
[* ...
*]