Open theeldermillenial opened 8 months ago
@agronholm sorry for tagging you but I couldn't assign you as a reviewer.
Hey @agronholm
Just wanted to bump this to make sure it doesn't fly under the radar.
@agronholm could you please take a look at this?
This PR only seems to do this for the pure Python variant, not the C decoder?
@agronholm Yes
Well, that's not good. I can't have the two different implementations deviate any further.
Does this implementation do anything that would result in a meaningful difference between Python and C?
As far as I can tell, it does not. I'm not changing anything about how the decoder actually works. I'm just changing the scope of the function calling dictionary. There are obvious differences between the Python and C versions strictly because they are written in different languages, right? I don't see how changing the scope of the function calling dict would be any different.
If you want to suggest things I can do to align the C version of the code with Python, I am happy to implement it.
I think the proper place to start might be adding these new properties to CBORDecoder_getsetters
in decoder.c
. I don't have much bandwidth to help you with general C extension development, however. The C code was mostly contributed by another developer anyway.
I have developed c-extensions in the past, so that's not an issue. I can contribute for sure, but I don't think a contribution make sense here because this seems to be just a language difference.
I just don't see the functional difference for the scope change because the function dictionary is only ever internally called. The current way that the Python code was written, having that dictionary be external created issues if you wanted to subclass it (which you can't do with the C-extension anyway).
If you're really looking for feature parity between these two things, the way to approach it would be a substantial rewrite of either the Python code or the C-extension.
Regardless, the scoping of that dictionary is arbitrary given the current implementation of both the C and Python sections of the code.
Please correct me if I'm wrong here, but this seems like an arbitrary change. I'm definitely happy to modify the c-extension if that's what is required.
The current way that the Python code was written, having that dictionary be external created issues if you wanted to subclass it (which you can't do with the C-extension anyway).
Wait, what?
>>> from _cbor2 import CBORDecoder
>>> class A(CBORDecoder):
... pass
...
>>> dir(A)
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'decode', 'decode_array', 'decode_bigfloat', 'decode_bytestring', 'decode_datetime_string', 'decode_epoch_datetime', 'decode_float16', 'decode_float32', 'decode_float64', 'decode_fraction', 'decode_from_bytes', 'decode_ipaddress', 'decode_ipnetwork', 'decode_map', 'decode_mime', 'decode_negative_bignum', 'decode_negint', 'decode_positive_bignum', 'decode_rational', 'decode_regexp', 'decode_self_describe_cbor', 'decode_semantic', 'decode_set', 'decode_shareable', 'decode_sharedref', 'decode_simple_value', 'decode_special', 'decode_string', 'decode_stringref', 'decode_stringref_namespace', 'decode_uint', 'decode_uuid', 'fp', 'immutable', 'object_hook', 'read', 'set_shareable', 'str_errors', 'tag_hook']
If you subclass the C CBORDecoder
class, and it has setters/getters for these decoder maps, then isn't that what you want?
My apologies. I'll take a look at this.
I'm having issues building the c-extension.
It looks like a few things were changed in the c-extension that are causing the build to fail. I was wondering if it was just me, but it looks like the git actions failed to build it as well.
I am capable of building 5.6.2, but 5.6.3 seemed to create two changes that are causing the build to fail:
There is a missing declaration in a header file for _CBOR2_date_ordinal_offset
. This is an easy fix.
Error for reference:
source/decoder.c: In function ‘CBORDecoder_decode_epoch_date’:
source/decoder.c:1346:41: error: ‘_CBOR2_date_ordinal_offset’ undeclared (first use in this function)
1346 | ordinal = PyNumber_Add(num, _CBOR2_date_ordinal_offset);
|
The way dates are parsed changed to use the Python builtin date object, and it's having issues that are not immediately apparent to me how to fix. Error for reference:
source/decoder.c:1348:50: error: expected expression before ‘PyDateTime_Date’
1348 | ret = PyObject_CallMethodObjArgs(PyDateTime_Date, _CBOR2_str_fromordinal, ordinal, NULL);
| ^~~~~~~~~~~~~~~
source/decoder.c:1348:23: error: too few arguments to function ‘PyObject_CallMethodObjArgs’
1348 | ret = PyObject_CallMethodObjArgs(PyDateTime_Date, _CBOR2_str_fromordinal, ordinal, NULL);
|
I'm compiling on: Python - 3.10 gcc - 11.4 Ubuntu - 22.04
Any suggestions on how to resolve 2?
It seems like there's a deeper problem that allows the failure to build the C extension to pass unnoticed. I'm making the appropriate changes to the tox and GitHub actions workflows to prevent this from happening in the future.
As expected, this broke the CI. Well, this won't catch me by surprise ever again.
Problem solved.
This PR moves a
dict
of decoders from a global context to a class context.The purpose of this is to make subclassing
cbor2.CBORDecoder
for modification.This PR assists in resolving #224 as an intermediate step.