CSS-Electronics / mdf4-converters

Convert your MDF4 log files into popular output formats
MIT License
54 stars, 20 forks

Add Parquet Conversion #8

Closed. Saldef closed this issue 2 years ago

Saldef commented 2 years ago

Could you add support for converting MDF4 to Parquet? It is really important for people working with Spark.

I know asammdf can convert to Parquet, but sometimes it doesn't work, and it would be nice to have a second open-source alternative when asammdf fails.

Thanks in advance

MatinF commented 2 years ago

Hi there, thanks for the suggestion. We do not currently have plans to support a parquet output. However, the library is open source so if you do create a version for this use case, we can of course review and potentially include it in the master.

You are also welcome to share more details on your use case by contacting us via our contact form. We'd be interested to hear how you'd use Spark for visualization of the CANedge data.

dapperfu commented 2 years ago

I've used parquet files as an intermediate for Dask data analysis.

In my case the pipeline was: convert to socketcan -> load into a pandas DataFrame -> save as Parquet.

A short script with CSV -> pandas -> parquet should work in most cases.

MatinF commented 2 years ago

Thanks, we'll consider this. Our aim is to potentially add Dask support natively in our Python API at some point. I think the Python API would be the simplest way to solve this in the meantime as well, since you can load the raw/decoded data into pandas DataFrames quite easily, then export/process the data as you'd prefer.

Closing this, though feel free to send us a mail via our contact form to discuss further - would like to understand these use cases in more detail to learn how we can best serve 'bigger data' processing use cases.