bigscience-workshop / data_tooling

Tools for managing datasets for governance and training.
Apache License 2.0

Create dataset AOC #287

Open albertvillanova opened 3 years ago

albertvillanova commented 3 years ago

Source: Masader Project

cakiki commented 2 years ago

@albertvillanova

From a footnote in the paper: "Data URL: http://cs.jhu.edu/~ozaidan/RCLMT/". The link leads to a 404; I'm not sure if the data is available anywhere.

The linked repo (which does not belong to any of the authors) does not seem to be complete: the paper mentions three data sources, but I can only see two of them. The data itself is just raw HTML, so I don't think it's the data this issue refers to. Maybe someone else can corroborate.

apergo-ai commented 2 years ago

self-assign

apergo-ai commented 2 years ago

I contacted the author.

albertvillanova commented 2 years ago

Thanks @cakiki and @apergo-ai. Any update from the author?