lawrennd / linguine

Description of the linguine data oriented architecture interface standard.

possible collab via croissant? #1

Open luisoala opened 2 days ago

luisoala commented 2 days ago

hi neal

just saw this repo pop up in my gh feed

i was wondering if youd be interested to explore synergies w croissant? https://github.com/mlcommons/croissant

we have also started exploring a more expansive scope of data interfaces, dubbed "tasks" for now, that go beyond description of just data to include more context on how the data is to be used (e.g. w a sample model, metric, baseline score for the metric)

lawrennd commented 2 days ago

Hi Luis,

That could be very interesting!

I was prepping this page as a first draft to share with @cabrerac and @apaleyes (they're now seeing this for the first time!).

It's part of our wider work around data oriented architectures.

https://arxiv.org/pdf/2302.04810

So there's definitely synergy between what we're looking at and standardisation of data sets.

It's derived from workflows I've got, not just for ML but for general data processing. They're implemented in

https://github.com/lawrennd/lynguine (note the y instead of i)

and

https://github.com/lawrennd/referia

Neil

luisoala commented 18 hours ago

neil

first off, apologies for misspelling your name :sweat:

this looks amazing!

ill try to use it to get a better sense of the workflow

overall, the broad scope for a general abstraction for data processing, whether ml or not, sounds great, especially wrt composability

i dont want to interfere w your momentum

but if you are interested in synergies, here are a few low stakes ideas in order of complexity

1) you present your vision to the croissant group (usually meet wednesday afternoon european time)

2) omar, joaquin, isabelle, i and a few other folks have started sketching out a scope for the "croissant tasks" concept https://docs.google.com/document/d/1cQ2nQvP4WXyd2AaOmVZoO_URn7whv09PWeLeScrFzEQ/edit?usp=sharing (you might need to request access). if this gels w your ambition we could see if we find a way to push something forward together. an esoteric sketch is attached (image: Screenshot from 2024-09-26 22-36-42)

3) we have been planning to hold a small composable data systems workshop in paris (before neurips or in the new year). this could be a good opportunity to explore some of these themes in person and in detail