Closed rvalek closed 1 year ago
Hi @rvalek , sorry, just saw the issue! I will dive in and see how complex is it to fix. Thanks for bringing it up!
Hi @rvalek , can you please try ti2023 branch to see if it helps with the issue?
git clone -b ti2023 https://github.com/evgskv/logica
It does look like there might be some kind of race condition in the way main
is written, but it's hard to be sure.
Please let me know if the fix is works or not.
PS: It should also be easy to manually patch in the change into your fork if needed: https://github.com/EvgSkv/logica/commit/7164c91b96f8d29264d9e6aa049b0937610b4673
Hey @EvgSkv! Thank you for looking into it! We applied your suggested change to try it out. I'll let you know whether the error is gone by the end of the week (it was occurring pretty infrequently, 5-10 times a week).
@EvgSkv sorry to report that the errors are still happening. We noticed a few within an hour of applying your change :\
The issue seems to be elsewhere. I shared my meager efforts at analysis it in my original message, in case it makes sense.
Hi @rvalek it does look to me that there was a possibility of the dictionary being modified while copying, even though I can't reproduce it myself. And as I was trying to debug it further I understood that my previous fix had a bug and wasn't really a fix at all. I've pushed a new version to the same branch (https://github.com/EvgSkv/logica/commit/ec506cd7088fec575c8e007f7cc136fd663f08b4)
Can you please try patching that and let me know if it helps?
Apologies for the bug, I am sure it won't take too many iterations to have it eliminated. And thank you for your help! Evgeny.
Hey @EvgSkv! Thanks for this solution, we tested for a week and it works for us :)
Great to hear that @KantorSerhiy ! 😄
thanks again for helping out, @EvgSkv ! closing the issue
Hi @EvgSkv!
Problem statement:
I am noticing a rather infrequent crash during compilation:
(the line numbers might be a bit shifted because I caught this on a slightly outdated compiler version -- this shouldn't relevant here).
This is coming from inside the builtin
copy.deepcopy
, and indicates that the number of keys of the dict inself.bulk_functions
was modified while it was being copied. I am getting this in a python process that may concurrently run several compilation (LogicaProgram(parsed_logica, user_flags=flags).FormattedPredicateSql(...)
) threads.So, the guess is that one thread crashes during
copy.deepcopy
while another is somehow modified the same dict.Sentiment:
I know that compiler thread-safety is nowhere guaranteed, so I would totally understand if you chose to disregard issue.
That said, I chose to post here because I am a little dumbfounded: I see this error periodically happening on this threaded service, but I can't reproduce it directly. I am also out of ideas for now as to what causes it, because going through the code I reach the following conclusions:
self.bulk_functions
that is simultaneously being copied in another compilation in thread 1, then the thread 2 is at a stage of compilation that is before or after making it's owncopy.deepcopy
.bulk_functions
dict BEFORE making a deep copy must've already been done, inside the thread 1 compilation (because it has reachedcopy.deepcopy
and crashed), and looking at the code I see that oncebulk_functions
is populated (prior to copying), it will not be changed.So, I can't even put a finger or a specific place where thread 2 might be mutating
bulk_functions
during it's own compilation to cause a crash in thread 1's compilation atcopy.deepcopy
.What am I missing? Do you have an idea where I should be looking? Does the reasoning make sense? Is the whole direction wrong?
P.S.
While analysing the above, I was trying to understand the relevance of
CleanDictionary
, as I see it happens sometimes, but I never actually reach deletion of any keys when I try to go step-by-step with the debugger. Either way, it shouldn't be related to the above issue becauseCleanDictionary
operates on the deepcopy after it has been made, so it can't touch the original dict, right?Any ideas on this subject would be appreciated! Sorry for a tl;dr...