Open sosna opened 3 months ago
Hi, thanks for opening this. Segfaults are indeed concerning.
I'm unable to reproduce this locally. Doing an audit of the dealloc
/visit
/clear
methods for Meta
turns up one code path where we were failing to decref properly, which would lead to a leak of the regex
object on dealloc, not a segfault. Still worth fixing, but not the cause.
Can you share more about what you changed (how was NC_NAME_ID_TYPE
being used before, what kinds of objects referenced it, ...)?
Another debugging strategy - do you see the same segfaults if you replace pattern=...
with some other constraint like min_length=1
or something? That would help determine if the segfault was related to our regex handling.
Alternatively, if you were able to provide a reproducible example that demonstrates the segfault, that would help me debug locally :).
Hi @jcrist. Thanks for your swift reply :)
Re. how was NC_NAME_ID_TYPE
being used before: It was used as type hint on a property of a Struct
. I simply replaced the type to str
for the time being. You can see an example of how it was used here: https://github.com/bis-med-it/pysdmx/blob/60692119bc5f6a4998cf7c3f85e4aa040d102414/src/pysdmx/api/qb/data.py#L337
Re. other types of Meta: Indeed, I searched for other instances and they appear to work fine. For example, we have snippets like the one below, and our CI pipeline does not complain. So, it could indeed be related to the regex handling, as you hinted.
first_n_obs: Optional[Annotated[int, msgspec.Meta(gt=0)]] = None
Re. the reproducible example: I have forked the project into my account to continue the debugging. You can access the list of commits in the appropriate branch here: https://github.com/sosna/pysdmx/commits/various_fixes/. You can see that the CI job for last commit (which removes the NC_NAME_ID_TYPE) succeeded (I ran it multiple times, to be sure), while the CI jobs for all the previous ones failed.
I hope this helps, but else, let me know :)
Description
Hey there,
First of all, thanks for this very nice library.
I wanted to report an issue, that seems to be related to msgspec, but you will be able to better assess whether that is indeed the case.
Our CI pipelines in GitHub started failing randomly (say, 20% failure rate), with a segmentation fault error. The failures disappeared after rerunning the failing jobs (one or more times depending on our luck). The failure was when running pytest (using python -m pytest). After setting python execution to a verbose mode, we noticed that, when it failed, then it always failed when destroying the following module:
pysdmx.model.types
. This module usesmsgspec.Meta
.This module was very simple:
Once we removed this, our CI pipelines started working again, 100% of the time. So, our suspicion is that there might be cases when using
msgspec.Meta
prevents created objects from being destroyed, but this is just a guess based on the log above and the fact that the error did not occur again when we stopped using that class. However, it could also be that our suspicion is wrong. In which case, feel free to close this issue :-).Thanks, Xavier