Closed Benjamin-Lee closed 5 years ago
I actually agree with Casey's suggestion. I suspect that some generic DL problems do have significant ethical/legal/etc. ramifications, but have not been associated with these issues yet. Granted, many of these could be much smaller (in terms of potential damage or degree of disruption) than those faced in biomedical applications. I realize that this is a somewhat minor point, but I think it is important to keep it in mind.
Hm, I think that replacing "cats" by "handwritten digits" is maybe not too bad, but then it kind of loses the "punchline" a bit, and I think this sentence might then sound a bit awkward to a person who is new to the field. I.e., one might wonder, "handwritten digits? Where does that come from now?" whereas "images of cats from the internet" more naturally indicates that it is a placeholder for some hobbyist task.
I.e., instead of saying "Fearing a rise of killer robots is like worrying about overpopulation on Mars" one could say "Fearing a rise of killer robots is like worrying about overpopulation in North Dakota" -- same point but a bit less obvious to someone not from the US.
Since this is targeted at people who are not overly familiar with deep learning, I intended the "pictures of cats on the internet" reference to be a callout to the common cats-on-the-internet reference. From a practitioner's point of view, I agree MNIST is likely the most common first challenge, but I feel it is not as accessible to a general audience.
Thoughts? I can change if we still feel it would be best to update.
Plus, classifying cats is one of the most popular, classic DL examples (https://www.wired.com/2012/06/google-x-neural-network), so DL folks would probably also appreciate it as an easter egg ;).
@agitter @cgreene @Benjamin-Lee @evancofer What are your thoughts here? Should we change the sentence or keep it as is? I'd like to close this out, and you had asked for a change. I like the cat reference, but if you prefer a different reference, I can update.
A ctrl-f of the current manuscript for "cats" doesn't return anything, so I think this has already been changed.
Oh - this was in #160 and I re-worked that sentence entirely. So yes, the cats are gone and apparently it's all my fault! 😼
Sounds good to me! Glad we can wrap up a couple of PRs and issues!
I realize that the PR is merged already, but we could conceivably add
While cat pictures from the internet [@url:https://www.wired.com/2012/06/google-x-neural-network/, @url:https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf] don't pose much of a danger if revealed, practitioners may encounter datasets that cannot be shared, such as ones for which there would be significant ethical or legal issues associated with release
I don't see what it adds, and even if this particular example does happen to be benign, I don't think it is helpful for the field if we assert that certain tasks are free of ethical concerns. Better to be vigilant even on tasks that seem whimsical or safe.
As @cgreene and @agitter have pointed out in #158, this sentence might not be the best:
@cgreene has proposed this as an alternative:
I personally think that the second half of the line (after the semicolon) is good but that the first part might be misconstrued. We could use MNIST instead of cat pictures as a generic DL task (and possibly add a footnote for those who don't know what MNIST is):