ohler55 / oj

Optimized JSON
http://www.ohler.com/oj
MIT License
3.14k stars 251 forks source link

Stop using the deprecated `Data_*` API #905

Closed casperisfine closed 10 months ago

casperisfine commented 10 months ago

The replacement API was introduced in Ruby 1.9.2 (2010), and the old untyped data API was marked a deprecated in the documentation as of Ruby 2.3.0 (2015)

Ref: https://bugs.ruby-lang.org/issues/19998

I made the minimal migration, which simply use TypedData, but it open the door to enable many GC features such as write barriers, immediate free, compaction support etc. But to avoid accidentally introducing any bugs I didn't do any of that, I kept changes the minimum.

On a side note I noticed this:

    if (given || 0 != ex) {
        DATA_PTR(doc->self) = NULL;
        // TBD is this needed?
        /*
        doc_free(pi.doc);
        if (0 != ex) {  // will jump so caller will not free
            OJ_R_FREE(json);
        }
        */

To answer the comment, yes this is most definitely a memory leak, but it's unclear to me why the data pointer is even cleared in the first place, instead of letting the object free the memory. So since I don't understand why this is done I didn't attempt to fix it.

ohler55 commented 10 months ago

Please fix the clang formatting errors and I'll merge.

casperisfine commented 10 months ago

I'll try, but if I may the contributing experience is awful. I have clang-format locally and it wants to makes changes different from the GitHub Action, and the GitHub action doesn't tell what it thinks it wrong.

And the clang-format --help is terrible.

casperisfine commented 10 months ago

Ok, the linter is happy now.

There seem to be a consistent error on 3.2, but I don't see how my changes could have caused that:

  1) Error:
GCTest#test_parse_object_gc:
Oj::ParseError: can not add attributes to a #<Class:0x00000001086bce58> (after child.child.child.child.child.child.child.child.child.child.chi
    test/test_gc.rb:49:in `object_load'
    test/test_gc.rb:49:in `test_parse_object_gc'
ohler55 commented 10 months ago

I've had mixed success with clang-formatter. The .clang-format file helps but there are variations with versions. The formatter was a solution to people submitting all sorts of coding styles and not following the one used in the rest of the code. The clang-format I've configured isn't exactly what I like but at least it is consistent.

ohler55 commented 10 months ago

I don't see the error on previous versions. I wonder if there was an accidental change that caused it. If you can debug it that would be great. I just fixed a migration issue with the old JSON gem tests for 3.2 in the raise-on-empty branch. I plan to merge that tonight. Maybe when we merge with your changes CI will pass.

casperisfine commented 10 months ago

The clang-format I've configured isn't exactly what I like but at least it is consistent.

Yeah, I totally understand the upside.

I don't see the error on previous versions. I wonder if there was an accidental change that caused it.

yeah that's weird. I'm pretty sure It was green before the clang-format changes. I'll try to have a look.

casperisfine commented 10 months ago

Ok, so bisecting is pointing the finger at 86e5e7934375d982b91f42ac466d560e4c11a26a. I'll keep digging.

casperisfine commented 10 months ago

Interesting, it fails in static void hash_set_value().

switch (rb_type(parent->val)) { end up in the default: case because it's "garbage":

(lldb) p *(struct RBasic *)parent->val
(struct RBasic)  (flags = 0, klass = 4353342000)

There must be a marking missing somewhere.

casperisfine commented 10 months ago

Alright, I'm an idiot:

static const rb_data_type_t oj_stack_type = {
    "Oj/stack",
    {
        NULL, // dmark
        stack_mark, // dfree
        NULL,

I was passing the mark function as a free function šŸ¤¦ .

It should go green now.

ohler55 commented 10 months ago

The remaining error is expected and has been fixed in another branch so all looks good. Thanks.