ruby / prism

Prism Ruby parser
https://ruby.github.io/prism/
MIT License
845 stars 142 forks source link

Add ASAN job in CI #3182

Closed eregon closed 1 month ago

eregon commented 1 month ago

build with a list of leaks: https://github.com/eregon/yarp/actions/runs/11333924655/job/31519044817 cc @KJTsanaktsidis

eregon commented 1 month ago

So in https://github.com/eregon/yarp/actions/runs/11333924655/job/31519044817 there are reported leaks from:

regparse.c (just 1)
prism/prism.c (multiple)
prism/templates/src/node.c.erb (multiple)
eregon commented 1 month ago

I guess one possibility is RUBY_FREE_AT_EXIT might not clean some allocations for Prism.

kddnewton commented 1 month ago

Looks like it was all one leak that @peterzhu2118 fixed. It looks like this is detecting a leak that memcheck isn't. Peter do you know why that might be?

peterzhu2118 commented 1 month ago

It looks like this leak is happening inside of Ruby. The heuristics in ruby_memcheck are being used for prism, which filters out memory leaks from Ruby.

https://github.com/ruby/prism/blob/fd899e856b9f31cc88794016581ae383efbaead4/rakelib/test.rake#L57

kddnewton commented 1 month ago

So do I understand this that this is duplicative of the memcheck CI run?

eregon commented 1 month ago

I think ASAN detect_leaks=1 and valgrind are complementary and might each find leaks the other doesn't find, but @peterzhu2118 or @KJTsanaktsidis probably know better.

eregon commented 1 month ago

Looks like it was all one leak that @peterzhu2118 fixed.

Nice, I guess https://github.com/ruby/ruby/commit/90aa6aefc4e351d3840cfd58872c5f42a11f4c7a was the fix.


I think ideally we'd have ASAN detect_leaks=1 and/or valgrind in CRuby to find leaks in CRuby sooner and catch them directly in CI. Given it's slow probably with a subset of test-all, sounds fine to me and certainly better than nothing.

And then in Prism the same, ASAN detect_leaks=1 and/or valgrind, to avoid Prism changes to introduce leaks without noticing.

Not sure how much overlap between the two tools though. One upside of ASAN is it doesn't need to build valgrind from source every time, OTOH it's more effort to repro locally (need to build CRuby with ASAN).

peterzhu2118 commented 1 month ago

I think ideally we'd have ASAN detect_leaks=1 and/or valgrind in CRuby to find leaks in CRuby sooner and catch them directly in CI.

This has been on my todo list but I haven't been able to find the time to push it past the finish line.

One upside of ASAN is it doesn't need to build valgrind from source every time

The issue is that the Valgrind in the apt repository was too old and had bugs. Looks like the valgrind in Ubuntu 24.04 is finally new enough (>= 3.20.0): https://packages.ubuntu.com/noble/valgrind

kddnewton commented 1 month ago

So what is the next step here — should we wait until this is a part of memcheck or try to add this separately? If we're going to add this separately I definitely want to add a suppression for this leak since I don't want it to be red when we merge.

eregon commented 1 month ago

I think waiting for adding valgrind and/or ASAN in CRuby CI would make most sense. That way we don't add a(nother) job which would fail because of a leak in CRuby.

That's also what we need to make ruby-asan builds usable with detect_leaks=1: https://github.com/ruby/setup-ruby/pull/653 And it would also test RUBY_FREE_AT_EXIT remains correct and complete.

ivoanjo commented 1 month ago

Another benefit in having this in CI is that it makes it easier for downstream gems (e.g. https://github.com/DataDog/dd-trace-rb/pull/3864 ) to enable ASAN testing without needing to deal with too many exceptions and whatnot ;)

eregon commented 1 month ago

@ivoanjo Yep, that's what I meant by "That's also what we need to make ruby-asan builds usable with detect_leaks=1". But it should be CRuby CI, not Prism, otherwise we'd only catch leaks in Prism (and failed builds from leaks in CRuby), which seems already well tested with ruby_memcheck. There are/were far more leaks in CRuby itself and BTW the ones we saw in the log above were actually in the Prism compiler, which is part of CRuby and not part of this repo.

@peterzhu2118

This has been on my todo list but I haven't been able to find the time to push it past the finish line.

One idea might be to add it in CI with some suppressions initially and then address these suppressions progressively (potentially by filing separate b.r-l.o bugs about them)? It would prevent new leaks. Otherwise it feels like there will likely be more leaks and maybe never a time where CRuby has no leaks (when running a significant part of the test suite with leak detection).


In general this is a bit the wrong place to discuss this, I think it'd make sense to create a https://bugs.ruby-lang.org/ ticket to add leak checking in CRuby CI. @ivoanjo maybe you could do that since you seem most interested in it?

eregon commented 1 month ago

I'll close this because I don't think it makes sense to merge until CRuby checks leaks in its CI.