Open Eboreg opened 2 years ago
Update: I find it works fine for text files for some reason, but only if I take the strings returned by MegaTransfer.getLastBytes(), convert them to bytes, and then crop them to the length given by MegaTransfer.getDeltaSize(). Like so:
buffer_str = transfer.getLastBytes()
delta_size = transfer.getDeltaSize()
buffer_bin = buffer_str.encode("utf-8", errors="surrogateescape")[:delta_size]
self.buffers_bin.append(buffer_bin)
I can then do b"".join(transfer_listener.buffers_bin).decode(), which gives me an exact copy of the original text.
Why it fails so miserably for binary files, though, is still a mystery to me.
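As a side check on the conversion step above: in plain Python, decoding arbitrary bytes as UTF-8 with the surrogateescape error handler and encoding them back the same way returns the original bytes. The sketch below uses a made-up payload and does not touch the MEGA bindings; it only suggests that the str-to-bytes conversion on its own need not be where data goes missing.

```python
# Hypothetical payload: valid UTF-8 text followed by bytes that are not valid UTF-8.
payload = "héllo".encode("utf-8") + bytes([0xFF, 0xFE, 0x80])

# Decode to str with surrogateescape, then reverse the conversion.
as_str = payload.decode("utf-8", errors="surrogateescape")
restored = as_str.encode("utf-8", errors="surrogateescape")

# The round trip preserves every byte, including the invalid ones.
print(restored == payload)
```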
I would like to try building the Python bindings with SWIG_PYTHON_STRICT_BYTE_CHAR, as per the SWIG documentation: http://swig.org/Doc4.0/Python.html#Python_nn77
How to do that is unfortunately beyond my competence at the moment.
I've been doing a little more debugging, and it seems that whatever generates the return value of MegaTransfer.getLastBytes() stops as soon as it encounters a null character.
E.g. if I do api.startStreaming(node, 0, 1000, transfer_listener), and there is a null at position 10 in the file, I only get characters 0 through 9 in return, even if character 11 is non-null.
I guess this makes sense, as SWIG assumes that a returned char * value is a null-terminated string (source). But that's not really helpful in this case.
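The truncation described above can be reproduced without the SDK. This is a minimal sketch using Python's ctypes, with a made-up 13-byte buffer standing in for the file contents; reading it as a null-terminated C string drops everything from the first null byte onward.

```python
import ctypes

# Hypothetical buffer with a null byte at position 10.
data = b"0123456789\x00bc"

# c_char_p treats the buffer as a null-terminated C string,
# so reading .value stops at the first null byte.
as_c_string = ctypes.c_char_p(data).value

print(len(data))         # full length of the buffer
print(len(as_c_string))  # length seen through the C-string view
```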
I managed to build the SDK with #define SWIG_PYTHON_STRICT_BYTE_CHAR. Everything is indeed bytes instead of str now, but unfortunately that didn't solve anything. The returned values still stop at the first null character.
Am I configuring the build wrong? Or is startStreaming() simply not meant to be used for binary files?
From the generated bindings/python/megaapi_wrap.cpp:
SWIGINTERN PyObject *_wrap_MegaTransfer_getLastBytes(PyObject *SWIGUNUSEDPARM(self), PyObject *args) {
PyObject *resultobj = 0;
mega::MegaTransfer *arg1 = (mega::MegaTransfer *) 0 ;
void *argp1 = 0 ;
int res1 = 0 ;
PyObject *swig_obj[1] ;
char *result = 0 ;
if (!args) SWIG_fail;
swig_obj[0] = args;
res1 = SWIG_ConvertPtr(swig_obj[0], &argp1,SWIGTYPE_p_mega__MegaTransfer, 0 | 0 );
if (!SWIG_IsOK(res1)) {
SWIG_exception_fail(SWIG_ArgError(res1), "in method '" "MegaTransfer_getLastBytes" "', argument " "1"" of type '" "mega::MegaTransfer const *""'");
}
arg1 = reinterpret_cast< mega::MegaTransfer * >(argp1);
{
SWIG_PYTHON_THREAD_BEGIN_ALLOW;
result = (char *)((mega::MegaTransfer const *)arg1)->getLastBytes();
SWIG_PYTHON_THREAD_END_ALLOW;
}
resultobj = SWIG_FromCharPtr((const char *)result);
return resultobj;
fail:
return NULL;
}
SWIGINTERNINLINE PyObject *
SWIG_FromCharPtr(const char *cptr)
{
return SWIG_FromCharPtrAndSize(cptr, (cptr ? strlen(cptr) : 0));
}
SWIGINTERNINLINE PyObject *
SWIG_FromCharPtrAndSize(const char* carray, size_t size)
{
if (carray) {
if (size > INT_MAX) {
swig_type_info* pchar_descriptor = SWIG_pchar_descriptor();
return pchar_descriptor ?
SWIG_InternalNewPointerObj(const_cast< char * >(carray), pchar_descriptor, 0) : SWIG_Py_Void();
} else {
#if PY_VERSION_HEX >= 0x03000000
#if defined(SWIG_PYTHON_STRICT_BYTE_CHAR)
return PyBytes_FromStringAndSize(carray, static_cast< Py_ssize_t >(size));
#else
return PyUnicode_DecodeUTF8(carray, static_cast< Py_ssize_t >(size), "surrogateescape");
#endif
#else
return PyString_FromStringAndSize(carray, static_cast< Py_ssize_t >(size));
#endif
}
} else {
return SWIG_Py_Void();
}
}
I notice SWIG_FromCharPtrAndSize() is called with a size argument generated by strlen(). And that function of course assumes it's dealing with a null-terminated string. The question is, could and should I do anything differently in order to avoid this? It seems to me like the reasonable thing would be for _wrap_MegaTransfer_getLastBytes() to call SWIG_FromCharPtrAndSize() directly, using the same size as reported by getDeltaSize().
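The difference between the two ways of determining the length can be illustrated in Python with ctypes. This is only a sketch of the idea, not the wrapper code: `raw` and `delta_size` are hypothetical stand-ins for the buffer behind getLastBytes() and the value of getDeltaSize().

```python
import ctypes

# Hypothetical chunk with an embedded null byte, copied into a C buffer.
chunk = b"abc\x00def"
raw = ctypes.create_string_buffer(chunk)  # a trailing null byte is appended
delta_size = len(chunk)                   # stand-in for getDeltaSize()

address = ctypes.addressof(raw)

# No explicit size: the read stops at the first null byte, like strlen().
by_strlen = ctypes.string_at(address)

# Explicit size: all delta_size bytes are returned, nulls included.
by_size = ctypes.string_at(address, delta_size)

print(by_strlen)
print(by_size)
```

With the explicit size, the embedded null is just another byte, which is the behaviour a binary stream needs.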
I ran into this problem recently. My solution is based on some changes to the megaapi_wrap.cpp file; here is the patch that I applied to version 3.12.0. You can apply it using patch megaapi_wrap.cpp megaapi_wrap.txt
megaapi_wrap.txt
@jorgeajimenezl Thanks! I was thinking along the same lines myself. Manually patching an auto-generated file is of course not the optimal solution, but it's better than nothing. :)
Thanks dude
I am trying to use MegaApi.startStreaming() via the Python bindings, but my MegaTransferListener.onTransferUpdate() reports huge differences between the values returned by MegaTransfer.getDeltaSize() and the lengths of the byte arrays I actually get from MegaTransfer.getLastBytes(), and so only a fraction of the file is actually received.
My debug listener:
I got the .encode() thing I do on the received string from the SWIG docs, so I guess it's the correct way to do it?
I also tried handling the returned data in onTransferData() instead, but it just had the exact same result.
I am testing this out by using a MegaNode belonging to a known file, and sending it to startStreaming() like so:
However, this is some of what the listener above logs:
So as you see, getLastBytes() consistently returns a much smaller amount of data than what getDeltaSize() reports. And I know for a fact that the actual file is 29458186 bytes.
I compiled with the following configure arguments (but I have tried various other combinations as well):
--disable-silent-rules --enable-python --with-python3 --disable-examples --enable-debug --enable-doxygen-html
Is there something obvious I'm missing here? I really hope someone can help me out a little.