zhangyuang / node-ffi-rs

Implement ffi in Node.js by Rust and NAPI
MIT License

How to obtain null-terminated utf8 string? #45

Closed neko-para closed 5 months ago

neko-para commented 5 months ago

Describe your problem in detail

How can I obtain a returned null-terminated utf8 string?

const char* MaaVersion();

According to the comment in the .d.ts file, DataType.String represents char16 rather than char8, but the demo function concatenateStrings in the documentation uses char8*. I'm not sure whether there have been breaking changes.

I've checked if I can use external as a workaround, but it seems that I cannot iterate the pointer data to calculate the length.

Currently, I've managed to call strlen on it and get the length, but retrieving the data with arrayConstructor then causes a crash.

export function version() {
  const ptr = load({
    library: 'maafw',
    funcName: 'MaaVersion',
    paramsType: [],
    paramsValue: [],
    retType: DataType.External
  })
  const len = load({
    library: 'libc',
    funcName: 'strlen',
    paramsType: [DataType.External],
    paramsValue: [ptr],
    retType: DataType.I32
  })
  console.log(ptr, len)
  const result = load({
    library: 'maafw',
    funcName: 'MaaVersion',
    paramsType: [],
    paramsValue: [],
    retType: arrayConstructor({
      type: DataType.U8Array,
      length: len
    })
  })
  console.log('result', result)
  return result
}

PS E:\Projects\MAA\maa-ffi-rs> tsx .\test\load.ts
[External: 1c0001bbee0] 40
PS E:\Projects\MAA\maa-ffi-rs>

zhangyuang commented 5 months ago

In JavaScript, strings are UTF-16. To be compatible with all cases, we need to transform the JS string into a UTF-16 string. ffi-rs then creates the C string from the UTF-16 byte array.

neko-para commented 5 months ago

Yes, I know Node uses UCS-2. The problem here is that a UTF-8 string is returned, and I have no idea how to transform or retrieve the data. I can decode the UTF-8 manually if I can get the data, though.
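To illustrate the manual decoding step mentioned here, a minimal sketch, assuming the raw bytes are already available as a Uint8Array (in Node.js, a Buffer is also a Uint8Array). The helper name decodeCString and the sample bytes are hypothetical and not part of ffi-rs; TextDecoder is the standard global available in Node.js and browsers.

```typescript
// Hypothetical helper: decode a C-style (null-terminated) UTF-8 byte
// sequence into a JavaScript string.
function decodeCString(bytes: Uint8Array): string {
  // Stop at the first NUL byte; if there is none, decode the whole array.
  const end = bytes.indexOf(0);
  const payload = end === -1 ? bytes : bytes.subarray(0, end);
  return new TextDecoder("utf-8").decode(payload);
}

// Sample bytes for "héllo" followed by the terminating '\0'.
// 'é' takes two bytes in UTF-8 (0xC3 0xA9) but one UTF-16 code unit in JS.
const sample = new Uint8Array([0x68, 0xc3, 0xa9, 0x6c, 0x6c, 0x6f, 0x00]);
const decoded = decodeCString(sample);
```

The byte length that strlen reports (6 in this sample) differs from the JavaScript string length (5), which is why the length passed to an array type has to be a byte count rather than a character count.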

zhangyuang commented 5 months ago

🤔 It looks like you can use DataType.External to pass the raw pointer between different C functions. If you get an incorrect result, please provide a simple reproduction.

neko-para commented 5 months ago

The code above should be enough; MaaVersion is a simple function that returns a string literal.

neko-para commented 5 months ago

What I need isn't to pass the pointer around, but to get the data it points to. I do pass the pointer as External to strlen and get the length of the string.

zhangyuang commented 5 months ago

OK, I will deal with that later.

zhangyuang commented 5 months ago

I tried the code below, and it gets the correct string length and buffer data:

// c and d are defined elsewhere (not shown); the assertions below imply
// that they concatenate to "foobar"
const ptr = load({
  library: "libsum",
  funcName: "concatenateStrings",
  retType: DataType.External,
  paramsType: [DataType.String, DataType.String],
  paramsValue: [c, d],
})
const len = load({
  library: "libnative",
  funcName: "strlen",
  retType: DataType.I32,
  paramsType: [DataType.External],
  paramsValue: [ptr],
})
equal(len, 6)
const foo = load({
  library: "libsum",
  funcName: "concatenateStrings",
  retType: arrayConstructor({
    type: DataType.U8Array,
    length: len
  }),
  paramsType: [DataType.String, DataType.String],
  paramsValue: [c, d],
})
equal(foo.toString(), "foobar")

neko-para commented 5 months ago

I've tried exactly the same code, but the program still crashes on the last load call. Is there any way to inspect what happened?

zhangyuang commented 5 months ago

Please provide your architecture, platform, and ffi-rs version, and print the value of len.

neko-para commented 5 months ago

x86_64, Windows, ffi-rs@1.0.79

export function version() {
  const ptr = load({
    library: 'maafw',
    funcName: 'test',
    paramsType: [DataType.String, DataType.String],
    paramsValue: ['1', '2'],
    retType: DataType.External
  })
  const len = load({
    library: 'libc',
    funcName: 'strlen',
    paramsType: [DataType.External],
    paramsValue: [ptr],
    retType: DataType.I32
  })
  console.log(ptr, len)
  const result = load({
    library: 'maafw',
    funcName: 'test',
    paramsType: [DataType.String, DataType.String],
    paramsValue: ['1', '2'],
    // DataType.External
    retType: arrayConstructor({
      type: DataType.U8Array,
      length: len
    })
  })
  console.log('result', result)
  return result
}

typedef const char* MaaStringView;

extern "C" MAA_FRAMEWORK_API MaaStringView test(MaaStringView a, MaaStringView b)
{
    std::ignore = a;
    std::ignore = b;
    return MAA_VERSION; // a git hash, char[40]
}

zhangyuang commented 5 months ago

Oh, it looks like the type of your return value is char[40], not char*? 🤔

neko-para commented 5 months ago

No, the return type and the type of the MAA_VERSION macro are both const char*.

neko-para commented 5 months ago

After all, strlen did calculate the length of the string.

zhangyuang commented 5 months ago

The crash could be caused by an invalid length for the data at the pointer, or by the string's memory having already been freed. You can try setting len to one or two to debug it, or provide a reproduction that I can test on macOS.

zhangyuang commented 5 months ago

Does your test succeed with concatenateStrings?

neko-para commented 5 months ago

The problem is that I missed the freeResultMemory option. It's quite counter-intuitive to have it enabled by default. 🤔

neko-para commented 5 months ago

But still, I currently have to call this function twice, which is not always possible. I've tried restorePointer, but it doesn't seem to be designed for this situation.

zhangyuang commented 5 months ago

So that is the reason. Although enabling freeResultMemory by default is counter-intuitive, we can't guarantee that every dynamic library manages its own memory. To avoid memory leaks, we should do this.

zhangyuang commented 5 months ago

If the type of the return value is char*, ffi-rs will free the pointer's memory after the FFI call. To avoid this, you can use External as the retType or set freeResultMemory to false.

neko-para commented 5 months ago

Actually, how does ffi-rs free this memory? Calling libc free on memory allocated via new seems to be implementation-defined (although it does work on the three main platforms).

zhangyuang commented 5 months ago

Yes, ffi-rs calls libc::free to free memory that was allocated on the C side.

neko-para commented 5 months ago

I mean, I currently need to call the function twice: first to get the length, then again with that length to get the data. But this isn't always possible, because some functions have side effects and shouldn't be called twice.

Is it possible to use restorePointer or something else to directly unpack that external ptr?

export function version() {
  const ptr = load({
    library: 'maafw',
    funcName: 'MaaVersion',
    paramsType: [],
    paramsValue: [],
    retType: DataType.External
  })
  const len = load({
    library: 'libc',
    funcName: 'strlen',
    paramsType: [DataType.External],
    paramsValue: [ptr],
    retType: DataType.I32
  })
  console.log(ptr, len)
  // here's what I'm attempting to do.
  const result = restorePointer({
    retType: [
      arrayConstructor({
        type: DataType.U8Array,
        length: len
      })
    ],
    paramsValue: [ptr]
  })

  // const result = load({
  //   library: 'maafw',
  //   funcName: 'MaaVersion',
  //   paramsType: [],
  //   paramsValue: [],
  //   // DataType.External
  //   retType: arrayConstructor({
  //     type: DataType.U8Array,
  //     length: len
  //   }),
  //   freeResultMemory: false
  // })
  console.log('result', result.toString())
  return result
}

zhangyuang commented 5 months ago

Yes, you can use restorePointer to restore the data behind a pointer. Is there any problem?

neko-para commented 5 months ago

The code above crashes again. I suspect I did something wrong. As I understand it, I need the code below, which doesn't match the type declaration and also causes a runtime error:

  const result = restorePointer({
    retType: arrayConstructor({
      type: DataType.U8Array,
      length: len
    }) as any,
    paramsValue: [ptr]
  })

The demo in the README only shows how to unpack a pointer to an array, which should dereference the pointer twice.

zhangyuang commented 5 months ago

restorePointer is the counterpart of createPointer: it restores data from what createPointer returns.

Using createPointer to create a pointer to a string returns a pointer to char*, that is, a char**.

You can use wrapPointer to add a level of indirection to a pointer:

const p = load({
  library: "libsum",
  funcName: "concatenateStrings",
  retType: DataType.External,
  paramsType: [DataType.String, DataType.String],
  paramsValue: [c, d],
})
const res = restorePointer({
  paramsValue: wrapPointer([p]),
  retType: [
    arrayConstructor({
      type: DataType.U8Array,
      length: 6
    })
  ]
})
console.log('xx', res, res.toString())

neko-para commented 5 months ago

This works. Also, I've noticed that I can use DataType.String now, with freeResultMemory: false.

zhangyuang commented 5 months ago

Based on user feedback that some dynamic libraries manage their own memory, ffi-rs will set freeResultMemory and freeCFuncParamsMemory to false by default starting from 1.0.80.
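The flow the thread converges on is: call the native function once with an External return type, measure the string with strlen, then read that many bytes through restorePointer and wrapPointer. A minimal sketch of that ordering follows. To keep it runnable without a native library, the ffi-rs calls are hidden behind a hypothetical adapter; CStringReader, readCString, and the in-memory fake are illustrative assumptions, not part of ffi-rs. The comments on each member name the call from this thread that it stands in for.

```typescript
// Hypothetical adapter over the three FFI steps discussed in this thread.
interface CStringReader<P> {
  // Stands in for load({ ..., retType: DataType.External }).
  call(): P;
  // Stands in for load({ funcName: 'strlen', paramsType: [DataType.External], ... }).
  strlen(ptr: P): number;
  // Stands in for restorePointer({ paramsValue: wrapPointer([ptr]),
  //   retType: [arrayConstructor({ type: DataType.U8Array, length })] }).
  readBytes(ptr: P, length: number): Uint8Array;
}

// Read a null-terminated UTF-8 string while invoking the native function once.
function readCString<P>(ffi: CStringReader<P>): string {
  const ptr = ffi.call();            // single invocation, so side effects happen once
  const length = ffi.strlen(ptr);    // byte length, excluding the terminator
  const bytes = ffi.readBytes(ptr, length);
  return new TextDecoder("utf-8").decode(bytes);
}

// In-memory stand-in for native memory; a "pointer" is an offset into it.
const memory = new TextEncoder().encode("v1.2.3\0trailing");
let calls = 0;
const fake: CStringReader<number> = {
  call: () => {
    calls++;
    return 0;
  },
  strlen: (ptr) => memory.indexOf(0, ptr) - ptr,
  readBytes: (ptr, length) => memory.slice(ptr, ptr + length),
};
const demoVersion = readCString(fake);
```

With a real library, each adapter member would wrap the corresponding ffi-rs call shown earlier in the thread. The sketch demonstrates only the ordering and the single invocation, not the behaviour of ffi-rs itself.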