bigcode-project / opt-out-v2

Repository for opt-out requests.

Opt-out request for duaneking #10

Closed duaneking closed 2 months ago

duaneking commented 1 year ago

I request that ALL DATA affiliated with this username is removed from The Stack:

This includes but is not limited to:

I did not consent to have my work included and consider AI being trained on me without my consent to be worse than theft.

From my perspective as the license owner, I did not expressly grant AI the right to use my code, so any attempt to use my code for AI is a violation of my more human-focused terms.

duaneking commented 1 year ago

My biggest issue with this project is the violation of LICENSE. I believe it to be a violation of the license terms of the projects that you have included in The Stack for you to use them in The Stack.

duaneking commented 1 year ago

And for the record, I don't believe that asking for forgiveness is better than asking for permission here. You literally stole people's code without their consent.

cheld commented 9 months ago

@duaneking hello, I am not part of the BigCode project. I am answering because your statement is incorrect. Copyright does not include reading and processing data. As the name suggests, it covers only copying. Literally speaking, you can go into a book store, take any book off the shelf, read it, put it back, and leave the store without paying anything. Again, reading and processing are not part of copyright and not covered by a license. They don't need a license. Even more, when you signed up for GitHub, you accepted the GitHub EULA. This EULA explicitly allows the right to read public repositories.

You see, the licenses you have picked are irrelevant in this subject. (This may not be true when code is reproduced without attribution later on in the generation process, but that is a different topic.)

duaneking commented 9 months ago

My intent is not to allow AI to read my code.

I am opting out.

I never consented to give any AI the right to read the code.

I consider the use of my code without my consent to be theft.

You can either deal with the social implications of knowingly going after people and stealing their code, and intentionally doing so in violation of their public personal request, or you can be the good guy.

Your webpage says people can opt out and that you will respect people's requests, so that's what I'm expecting you to do.

I am answering because your statement is incorrect.

No, it's not.

Copyright does not include reading and processing data.

My license does not grant non-humans the right to read or process the data, and it requires that humans accept it before they are allowed to use the code. Nowhere do my licenses grant AI the right to use the code.

Literally speaking, you can go into a book store, take any book off the shelf, read it, put it back, and leave the store without paying anything.

I have worked in bookshops; the phrases you might be looking for to describe that and understand it better are "shopworn" and "moral hazard". Your statement is not actually true; only some shops allow it, when they have the license agreements to do so. It's not binary.

Again reading and processing is not part of copyright and not covered by license.

Exactly, I never granted these rights, so AIs do not have them. These rights must be explicitly, not implicitly, granted.

They don't need license.

Yes they do. See above.

Even more, when you signed up for GitHub, you accepted the GitHub EULA. This EULA explicitly allows the right to read public repositories.

That does not grant rights to the code, however. For example, I have public repos I worked on that are owned by MS that are not in The Stack for obvious reasons, so it could be said that The Stack is just cherry-picking to steal content.

You see, the licenses you have picked are irrelevant in this subject.

No, it's not, and in fact the license is very important.

lvwerra commented 2 months ago

Your opt-out request has been processed, and your data was removed in version v2.0.1 of The Stack and all future versions. Also, your data was not used for the training of StarCoder2.

[PROCESSED]