Open ELePors opened 3 months ago
Thanks for sharing the workarounds!
Thanks for the report.
The problem is that FFI memorizes the path of the library inside the compiled method: the CompiledMethod instance of the ffiCall (for example SDL2 class>>modState) contains in its "literal2" a TFExternalFunction whose moduleName holds a string pointing to the library path. Maybe we should create a mechanism to clean up all references to moduleName?
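For anyone who wants to check this in their own image, here is a minimal sketch that lists the library paths cached in the literals of one FFI method. It uses the SDL2 class>>modState example named above and only makes sense after the method has been called at least once, so that the generated FFI method is installed. The #moduleName accessor is an assumption based on the slot name mentioned in this comment; it may differ between Pharo versions.

```smalltalk
"Sketch: show the module paths cached in the literals of an FFI method.
 Assumption: TFExternalFunction answers #moduleName (named after the slot
 described above); check this in your image before relying on it."
| method cached |
method := SDL2 class >> #modState.
cached := method literals select: [ :each | each isKindOf: TFExternalFunction ].
cached collect: [ :each | each moduleName ]
```

If the collection is not empty and the paths point to directories of another machine, the image is in the state described in this issue.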
Hi Eric, Pablo will have a look.
Yes! I am checking it
Hi @ELePors, do you have any insight into how to reproduce it? I am trying, but maybe you have a clue that might help.
Hi !
A simple way to reproduce it… you open a Pharo image with Bloc samples and examples… you close them and save the image…
Then you copy the image to another PC on which Pharo is not located in the same directories (on the first it is in My Documents\Pharo\vm, on the second in D:\Pharo\vms).
And just try to open the image…
Eric
From: Pablo Tesone. Sent: Thursday, 5 September 2024, 11:39. To: pharo-project/pharo. Cc: LE PORS Eric; Mention. Subject: Re: [pharo-project/pharo] Image crash or errors when openning another developper Pharo 11 image (Issue #17029)
Hi Eric, with Guille we have an idea of something that might work. As we cannot reproduce the problem frequently enough to be sure that the fix is working, would you mind testing it? The idea is to patch FFICalloutAPI>>#function:library: with:
New code:
function: functionSignature library: moduleNameOrLibrary

    | sender ffiMethod ffiMethodSelector |
    sender := self senderContext.
    ffiMethodSelector := self uFFIEnterMethodSelector.

    "Build new method"
    ffiMethod := self newBuilder
        build: [ :builder |
            builder
                signature: functionSignature;
                sender: sender;
                fixedArgumentCount: fixedArgumentCount;
                library: moduleNameOrLibrary ].
    ffiMethod
        selector: sender selector;
        methodClass: sender methodClass.

    "Replace with generated ffi method, but save old one for future use"
    ffiMethod
        propertyAt: #ffiNonCompiledMethod
        put: sender method.

    "For senders search, one needs to keep the selector in the properties"
    ffiMethod propertyAt: #ffiMethodSelector put: ffiMethodSelector.
    sender methodClass methodDict at: sender selector put: ffiMethod.

    "Register current method as compiled for ffi"
    FFIMethodRegistry uniqueInstance registerMethod: ffiMethod.

    "Resend"
    sender
        return: (sender receiver withArgs: sender arguments executeMethod: ffiMethod).
    ^ self
No, I still get an error message, in loadSymbol:module:
The stack:
TFFIBackend>>primLoadSymbol:module:
TFFIBackend>>loadSymbol:module:
ExternalAddress class>>loadSymbol:module:
TFExternalFunction>>validate
TFSameThreadRunner>>invokeFunction:withArguments:
AeCairoImageSurface class(AeCairoSurface class)>>externallyFree:
AeCairoImageSurface class(AeCairoSurface class)>>finalizeResourceData:
FFIExternalResourceExecutor>>finalize
[ anEphemeron value finalize ] in FinalizationRegistry>>finalizeEphemeron: in Block: [ ...
FullBlockClosure(BlockClosure)>>on:do:
[ Processor terminateRealActive ] in [ :ex |
| onDoCtx handler bottom thisCtx |
onDoCtx := thisContext.
thisCtx := onDoCtx home.
"find the context on stack for which this method's is sender"
[ onDoCtx sender == thisCtx ] whileFalse: [
onDoCtx := onDoCtx sender.
onDoCtx ifNil: [ "Can't find our home context. seems like we're already forked
and handling another exception in new thread. In this case, just pass it through handler."
^ handlerAction cull: ex ] ].
bottom := [ Processor terminateRealActive ] asContext.
onDoCtx privSender: bottom.
handler := [ handlerAction cull: ex ] asContext.
handler privSender: thisContext sender.
(Process forContext: handler priority: Processor activePriority) resume.
"cut the stack of current process"
thisContext privSender: thisCtx.
nil ] in FullBlockClosure(BlockClosure)>>on:fork: in Block: [ Processor terminateRealActive ]
module = "P:\PRG\Pharo\images\Pharo 12.0 - tests\120-x64\libcairo-2.dll"
moduleSymbol = #cairo_surface_destroy
I've just tried to start the image with my Pharo Launcher: the 120-x64 directory no longer exists in the "Pharo 12.0 - tests" folder, but it does exist under the vms directory of Pharo.
Eric.
I am creating two PRs for the concurrency issue that leaves FFI methods around (#17118 and #17117). I think they will not fix all the problems. I have also seen that there are some issues in the reinitialization of Bloc when reopening windows. These errors produce segmentation faults and leave locks held. So the PRs will solve only a small part of the problem...
When I open a Pharo 11 image (Pharo-11.0.0+build.726 64bits) + Bloc + Alexandrie copied from another developer's PC onto my PC, the image crashes because of SDL in the FFI part of the code. In the traces I find the directory of the SDL library as it was on the other developer's PC. When SDL starts to call the library, it uses this "cached" directory, and if it does not exist -> crash.
A workaround is to recompile the whole SDL package in a headless start of Pharo and then save the image...
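A possibly lighter variant, which is not the recompilation described above but a swapped-in technique: ask the FFI method registry to discard the FFI methods it has generated, so that they are rebuilt (and the library looked up again) on their next call, then save the image. This is only a sketch. It assumes that FFIMethodRegistry class>>resetAll is available in the image, and it cannot help for generated methods that were never registered.

```smalltalk
"Sketch: drop the FFI methods generated in this image, then save and quit.
 Assumption: FFIMethodRegistry class>>resetAll exists in this Pharo version;
 verify it in your image before scripting it."
FFIMethodRegistry resetAll.
Smalltalk snapshot: true andQuit: true.
```

Run it once in the copied image before opening any Bloc or SDL window, so that no stale path is used before the reset.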
I have the same problem with LibGit in Iceberg as well... I use the following workaround:
FFI should not crash, so I am posting this issue so that these workarounds are no longer needed...
Cheers Eric.