openzim / node-libzim

Libzim binding for Node.js: read/write ZIM files in Javascript
https://www.npmjs.com/package/@openzim/libzim
GNU General Public License v3.0
27 stars 11 forks source link

Segmentation Fault when using a custom item #80

Open kelvinhammond opened 2 years ago

kelvinhammond commented 2 years ago

In libzim v8 when using a custom item and adding a lot of objects to a zim file it will throw a segmentation fault error after about 10,000 items have been inserted.

The error has been reported to nodejs/node-addon-api below.

It appears to be related to:

To reproduce the error modify tests/makeLargeZim.ts file to use the custom item and run the test, it should fail.

The issue has been reported to https://github.com/nodejs/node-addon-api/issues/1174 .

Related to https://github.com/openzim/node-libzim/pull/72

kelson42 commented 1 year ago

@kelvinhammond One hour ago, I have updated the build script to use libzim 8.1.0 which has been released yesterday (and fix/update the CI). I have also run (for the first time) $ npm run test-mem-leak twice on my Ubuntu 22.04 + Node.js 18.12.1 and... all works fine. It ran twice up to the end without any error message. Maybe it would be worth it you try again?

kelvinhammond commented 1 year ago

@kelson42 Nice, will try again. I wonder if they fixed it! Thank you.

I should have a bit more time soon (hopefully) to take a look at this and mwoffliner during the holidays.

kelson42 commented 1 year ago

@kelvinhammond Thx, would be really helpful to know soon if this critical bug is really fixed or if I just don't run the reproduction case steps properly.

kelvinhammond commented 1 year ago

@kelvinhammond Thx, would be really helpful to know soon if this critical bug is really fixed or if I just don't run the reproduction case steps properly.

@kelson42 Try it with these changes Edit test/makeLargeZim.ts Uncomment the const customItem code and have await creator.addItem(stringItem); be replaced by await creator.addItem(customItem); and see if that works for you.

The diff looks like this.

diff --git a/test/makeLargeZim.ts b/test/makeLargeZim.ts
index b5db866..e87979e 100644
--- a/test/makeLargeZim.ts
+++ b/test/makeLargeZim.ts
@@ -43,13 +43,13 @@ console.log(`Making ZIM file with [${numArticles}] articles`);
       data,
     );

-    /*
     const customItem = { // custom item
       path: url,
       mimeType: "text/html",
       title: title,
       hints: {FRONT_ARTICLE: 1},
       getContentProvider() { // custom content provider
+        return stringItem.getContentProvider();
         let sent = false;
         return {
           size: data.length,
@@ -63,9 +63,9 @@ console.log(`Making ZIM file with [${numArticles}] articles`);
         };
       },
     };
-    */

-    await creator.addItem(stringItem);
+    //await creator.addItem(stringItem);
+    await creator.addItem(customItem);
   }

   console.log(`Finalising...`);
kelvinhammond commented 1 year ago

I tested it with node v19.1.0

I get

terminate called after throwing an instance of 'Napi::Error'

  what():  Illegal invocation

#8  0x00007ffff031a5c3 in Napi::Function::Call (this=0x7fffffffa0c0, recv=0x555555632990, argc=0, args=0x0) at 
/node-libzim/node_modules/node-addon-api/napi-inl.h:2341

#9  0x00007ffff031a482 in Napi::Function::Call (this=0x7fffffffa0c0, recv=0x555555632990, args=std::initializer_list of length 0) at /node-libzim/node_modules/node-addon-api/napi-inl.h:2306

#10 0x00007ffff0326f92 in ContentProviderWrapper::feed()::{lambda(Napi::Env, Napi::Function)#1}::operator()(Napi::Env, Napi::Function) const (__closure=0x7fffc40245f0, env=..., feedFunc=...) at ../src/contentProvider.h:79

Looking into this more.

kelvinhammond commented 1 year ago

@kelson42 It does work, just not with custom items.

kelson42 commented 1 year ago

@kelvinhammond Sorry for the big delay in answering. I have involved @mgautierfr who is the lead developer of the libzim. Could you please elaborate what are these "custom items". This is not a concept of the libzim and therefore this is a bit puzzling. One important other point is if we really need them (does that impact MWoffliner)?

kelson42 commented 1 year ago

BTW @uriesk, this is the bug which stops us to use latest libzim in MWoffliner (we are stuck with libzim6 which is18 months old). If you like to be challenged, this might be one for you ;)

kelvinhammond commented 1 year ago

@kelson42 In libzim you can create Item(s) which have each have a ContentProvider, some are provided by default such as StringItem and FileItem but you can define and implement your own. In node-libzim StringItem and FileItem work fine but when you try to define your own it fails because of an issue (node-addon-api#1174) in node or node-addon-api (napi) when creating a large file (sometimes a smaller one too)

kelson42 commented 1 year ago

@kelvinhammond But we don't need this customItem in MWoffliner? You confirm?

kelvinhammond commented 1 year ago

@kelvinhammond But we don't need this customItem in MWoffliner? You confirm?

I'll test a large file using only StringItem and FileItem to confirm it works but yes, mwoffliner does not use the custom item and content providers

kelson42 commented 1 year ago

@kelvinhammond This would be a good news. Waiting to your confirmation. If confirmed, we could consider to release with this bug/limitation.

uriesk commented 1 year ago

BTW @uriesk, this is the bug which stops us to use latest libzim in MWoffliner (we are stuck with libzim6 which is18 months old). If you like to be challenged, this might be one for you ;)

sorry, it's out of my league. I am not yet familiar enough with node-addon-api.

kelson42 commented 1 year ago

@kelson42 In libzim you can create Item(s) which have each have a ContentProvider, some are provided by default such as StringItem and FileItem but you can define and implement your own. In node-libzim StringItem and FileItem work fine but when you try to define your own it fails because of an issue (node-addon-api#1174) in node or node-addon-api (napi) when creating a large file (sometimes a smaller one too)

@mgautierfr Do you confirm this (not sure this is exactly how things work at the libzim)? Maybe there is a few things to know and check if using a custom provider?

mgautierfr commented 1 year ago

One important thing to remember is that the ContentProvider returned by getContentProvider must stay valid after libzim has release the shared_ptr<Item> passed in creator::addItem.

libzim will use the shared_ptr<Item> to get a ContentProvider but will release the shared ptr before using the ContentProvider (potentially in another thread). So if ContentProvider is referencing the data stored in the item. We may have a situation where we free the item (and so, its data) and we have a ContentProvider referencing a freed data. I don't know how item_.Value() behaves. But if it return a reference/pointer to the value we are in this situation. ContentProvider must prevent the value it will return from being freed. Either by owning the data, copying it, keeping a reference to the item containing the data, ...

I don't know how napi wrap things. But here : https://github.com/openzim/node-libzim/blob/main/test/makeLargeZim.ts#L54-L62, your custom item return a contentProvider which contain a closure as feed method. So you must be sure that the referenced data (literally data) is not freed when libzim is calling feed method. As libzim keep a reference to the contentProvider without keeping a reference data maybe NAPI think that nothing reference the data variable and free it. And depending of how closures work in js, it can be worse as all contentprovider may share the same data. At least in python, this is the case (see my own old answer in stackoverflow at this subject : https://stackoverflow.com/questions/21449176/issue-with-generating-ttk-checkboxes-in-loops-and-passing-arguments/21459922#21459922)

Maybe you can try this custom item which not create closure (but copy (reference to?) data):

const customItem = { // custom item
  path: url,
  mimeType: "text/html",
  title: title,
  hints: {FRONT_ARTICLE: 1},
  item_data: data, // reference directly the data without closure
  getContentProvider() { // custom content provider
    return {
      size: this.item_data.length,
      data_to_send: this.item_data, // same here, we don't want the feed() method being a closure accessing the `item_data` in customItem which will be deleted.
      sent = false,
      feed() {
        if(!this.sent) {
          this.sent = true;
          return new Blob(this.data_to_send);
        }
        return new Blob();
      }
    };
  },
};
kelson42 commented 1 year ago

@kelvinhammond Does that help? Do we have anything preventing the data to be freed?

kelvinhammond commented 1 year ago

@kelson42 Sadly, it did not. I store a reference to the objects and functions in node so it "should" keep a copy of it in memory without garbage collecting it. The error also seems to happen (most of the time) when the object is creating a thread safe function callback.

As for the StringItem, that works, I haven't verified FileItem yet (I need to write a bit of code to test this too in a large tmp folder for each article we store). StringItem and FileItem work because they don't encounter the bug calling to getContentProvider and feed because I use an internally stored StringItem and FileItem smart ptr. I can test FileItem to be sure but I think it's ready to go. Is there a way to test mwoffliner with a large enough (doesn't need to be huge) archive to verify it works?

kelson42 commented 1 year ago

Is there a way to test mwoffliner with a large enough (doesn't need to be huge) archive to verify it works?

You mean with a big wikipedia and your PR?

kelson42 commented 1 year ago

Considering that this bug does not impact our current scrapers AFAIK, I propose to release 3.0.0 with this bug, so we can move forward and benefit from all the work already done.

@kelvinhammond @mgautierfr Any thpughts?

kelvinhammond commented 1 year ago

Considering that this bug does not impact our current scrapers AFAIK, I propose to release 3.0.0 with this bug, so we can move forward and benefit from all the work already done.

@kelvinhammond @mgautierfr Any thpughts?

Sure, release it but include a note in the change notes please.