Open provinzkraut opened 1 year ago
My gut reaction is I don't think this feature makes sense in msgspec. Unlike the other builtin formats, csvs are not really standardized, which is why pandas.read_csv has a whopping 49 configuration parameters.
Also, since CSVs are tabular in nature I'd recommend using one of python's many existing tabular apis (pandas, polars, pyarrow, ...). These native representations avoid creating a PyObject per element, and can be much more efficient than anything I'd implement in msgspec. I'm biased here though - my day job is doing python data stuff, I'm less familiar with what web engineers need.
I am curious though - what's your use case for mixing msgspec & csvs? If it's "easy enough to implement yourself", would you be open to contributing an example to the examples directory showing how this would work?
> My gut reaction is I don't think this feature makes sense in msgspec. Unlike the other builtin formats, csvs are not really standardized, which is why pandas.read_csv has a whopping 49 configuration parameters.
My suggestion would have been to simply base this on what the standard library provides. Its csv module is solid, but not as configurable as others.
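To make the "not really standardized" point concrete, here is a minimal sketch using only the standard library's csv module. The sample data and the helper name read_rows are made up for illustration. It shows the same logical table in two common flavours, and what happens when the delimiter is not specified for the second one.

```python
import csv
import io

# Two encodings of the same logical table (sample data is hypothetical).
comma_data = 'name,note\nwidget,"small, blue"\n'
semicolon_data = "name;note\nwidget;small, blue\n"


def read_rows(text, **fmt):
    """Parse CSV text into a list of rows, forwarding format options to csv.reader."""
    return list(csv.reader(io.StringIO(text), **fmt))


# Default dialect: comma-delimited, double-quote as the quote character.
default_rows = read_rows(comma_data)

# The semicolon flavour only parses correctly when the delimiter is given.
semi_rows = read_rows(semicolon_data, delimiter=";")

# Without it, the reader splits on the comma inside the note instead.
misread_rows = read_rows(semicolon_data)
```

Both default_rows and semi_rows come out as the same two rows, while misread_rows silently yields differently shaped rows, which is the kind of ambiguity any CSV wrapper would have to expose options for.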
> Also, since CSVs are tabular in nature I'd recommend using one of python's many existing tabular apis (pandas, polars, pyarrow, ...). These native representations avoid creating a PyObject per element, and can be much more efficient than anything I'd implement in msgspec.
I imagined it to be more for convenience than performance. Similar to how tomli is wrapped by the msgspec.toml module, instead of building a toml parser into msgspec (=
> I am curious though - what's your use case for mixing msgspec & csvs?
Twofold.
msgspec.Struct and dataclasses, it would be nice if I could just msgspec.csv.decode(<raw csv data>, type=list[RowModel]). Right now I've added a wrapper that achieves this using the standard library's csv module
msgspec.Struct already and I need to turn it into CSV. Same as before, I've added a wrapper around csv, but having msgspec.csv.encode(<list of structs>) work would be nice
> If it's "easy enough to implement yourself", would you be open to contributing an example to the examples directory showing how this would work?
Sure thing! If you say this isn't something that should be part of msgspec I could contribute my stuff as an example, otherwise I'd also be open to implementing support for it in a similar fashion to the yaml and toml submodules in case you'd want to go for that (=
Thanks for the extra info. I think for now I'd like to leave this out of msgspec proper. If another user asks for it we can always add it then, but it's harder to remove support for something later.
If you have the time, I'd love an example in the examples directory showing how to integrate msgspec with the csv module. It'd be a nice example of using msgspec.convert/msgspec.to_builtins to support a new protocol.
> If you have the time, I'd love an example in the examples directory showing how to integrate msgspec with the csv module. It'd be a nice example of using msgspec.convert/msgspec.to_builtins to support a new protocol.
Will do!
I have a use case where I'd also like to write to CSV/TSV and have built a wrapper to do so. @provinzkraut did you end up building an example? I'd love to see one for comparison's sake!
Description
How do you feel about adding CSV support, similar to what's provided in the yaml and toml modules?
It's easy enough to implement yourself, but I feel like it would be nice to have (=