devsgnr / breadroll

breadroll 🥟 is a simple, lightweight library for data processing operations, written in TypeScript and powered by Bun.
https://breadrolljs.vercel.app
MIT License
66 stars 0 forks

Feature request: strongly typed dataframes #21

Open itsyoboieltr opened 3 months ago

itsyoboieltr commented 3 months ago

The big advantage of using this over Python would be speed, and also that it could be made completely type-safe. If the user knows the rows/columns ahead of time (which is most likely the case), I would suggest we add the possibility to pass a type parameter.

import Breadroll, { Dataframe } from 'breadroll';

interface Row {
  name: string;
  age: number;
  city: string;
}

const csv: Breadroll = new Breadroll({ header: true, delimiter: ',' });

const df: Dataframe<Row> = await csv.open.local<Row>('./data/input.csv');

This would ideally make all functions completely type-safe: you should not be able to access columns that do not exist, you have autocomplete for filtering, etc.
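To make the idea concrete, here is a minimal sketch of how a type parameter can constrain column access and filtering. `TypedFrame` and its methods are hypothetical names made up for illustration; they are not breadroll's API, just one way the typing could work.

```typescript
// Hypothetical sketch: TypedFrame is NOT part of breadroll's API.
interface Row {
  name: string;
  age: number;
  city: string;
}

class TypedFrame<T extends object> {
  private readonly rows: readonly T[];

  constructor(rows: readonly T[]) {
    this.rows = rows;
  }

  // Only keys of T are accepted; the element type follows the column type.
  column<K extends keyof T>(key: K): T[K][] {
    return this.rows.map((row) => row[key]);
  }

  // The predicate receives a value typed as T[K], so editors can autocomplete.
  filter<K extends keyof T>(key: K, predicate: (value: T[K]) => boolean): TypedFrame<T> {
    return new TypedFrame(this.rows.filter((row) => predicate(row[key])));
  }

  get count(): number {
    return this.rows.length;
  }
}

const frame = new TypedFrame<Row>([
  { name: "Ada", age: 36, city: "London" },
  { name: "Linus", age: 28, city: "Helsinki" },
]);

const ages: number[] = frame.column("age");
const over30 = frame.filter("age", (age) => age > 30);
// frame.column("email"); // compile-time error: "email" is not a key of Row
```

The key piece is the `K extends keyof T` constraint: a misspelled or nonexistent column name fails at compile time instead of returning `undefined` at runtime.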

devsgnr commented 3 months ago

Big thanks, @itsyoboieltr, this is a fantastic feature suggestion! ✨ It's already rocketed its way onto the roadmap while we think about how to implement it. We'll ship this :shipit: as soon as we can.

itsyoboieltr commented 3 months ago

In addition to the type-level safety, maybe there should also be an option for runtime validation, similar to how tRPC does it with Input & Output Validators. This would probably be too costly for very large datasets, but for small-to-medium-sized ones it could prove useful for making sure the data is sane. Interestingly, most of these validators also support some kind of data transformation, encoding-decoding, etc., so allowing their use in some way would immediately widen the array of features this package provides :D
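A rough sketch of what opt-in runtime validation could look like, using a hand-written validator instead of any particular library. `Validator`, `parseRow`, and `validateRows` are hypothetical names for illustration only; nothing here comes from breadroll or tRPC.

```typescript
// Hypothetical sketch: none of these names come from breadroll or tRPC.
interface Row {
  name: string;
  age: number;
  city: string;
}

// A validator takes untrusted input and either returns a typed value or throws.
type Validator<T> = (input: unknown) => T;

const parseRow: Validator<Row> = (input) => {
  if (typeof input !== "object" || input === null) {
    throw new TypeError("row must be an object");
  }
  const record = input as Record<string, unknown>;
  const { name, city } = record;
  if (typeof name !== "string" || typeof city !== "string") {
    throw new TypeError("name and city must be strings");
  }
  // Transformation step: CSV cells arrive as strings, so coerce age to a number.
  const rawAge = record.age;
  const age = typeof rawAge === "string" && rawAge.trim() !== "" ? Number(rawAge) : rawAge;
  if (typeof age !== "number" || !Number.isFinite(age)) {
    throw new TypeError(`age is not a number: ${String(rawAge)}`);
  }
  return { name, age, city };
};

// Validation is a separate, optional pass, so large datasets can skip the per-row cost.
function validateRows<T>(rows: readonly unknown[], validate: Validator<T>): T[] {
  return rows.map((row, index) => {
    try {
      return validate(row);
    } catch (error) {
      const reason = error instanceof Error ? error.message : String(error);
      throw new Error(`row ${index}: ${reason}`);
    }
  });
}

const validated = validateRows([{ name: "Ada", age: "36", city: "London" }], parseRow);
```

Because the validator is just a function from `unknown` to `T`, a schema library's parse function could be passed in the same position, which is how validation and transformation end up sharing one hook.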

devsgnr commented 3 months ago

These are some great insights. Data transformations like encoding and decoding are among the next things we are looking at, since we already have the .apply method. 🚀 Thanks