instructlab / community

InstructLab community-wide collaboration space, including contributing, security, code of conduct, etc.
Apache License 2.0

GATING - open an issue on proper citation of data sources longer term #142

Open lhawthorn opened 6 months ago

lhawthorn commented 6 months ago

tl;dr - Properly citing the data used to generate the model is difficult, and how to do so in an automated way has not yet been thought through. There is no known industry standard for this. We need a place for the community to discuss this idea and to flag when they think a data source is not properly cited/attributed.

lhawthorn commented 6 months ago

Maybe this should not be an issue; maybe it should be a GitHub discussion. Need to think about this and discuss with other project members.

jjasghar commented 1 month ago

This is still something we need to figure out going forward. There is an opportunity to automate this, but the issue stays open until we have an answer.