Open WNYmathGuy opened 4 years ago
This is a good idea. However, for us to implement this in Friendica, we would need an existing database of known illegal content and a way to query it. We already do this for exposed passwords, but after a cursory search I'm not aware of a similar free or open service for detecting child pornography or copyrighted content.
The question is whether any database like this exists. Obviously we cannot provide such a database ourselves ...
I kept looking and found one in the UK, but you have to be a paid member of that non-profit organization. Nothing freely accessible.
About halfway through the podcast I linked in my I.P. the interviewee discusses a Microsoft database of that type.
Microsoft has a system called PhotoDNA which fingerprints pictures and videos and makes it possible to find matches between them. It is only the query part of the solution I outlined above; we still need a readily available database to query. Not the images/videos themselves, of course, but the hashes produced by PhotoDNA. This is what the UK non-profit is providing.
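To make the "fingerprint, then match" idea concrete, here is a minimal sketch. PhotoDNA's algorithm is proprietary and is not reproduced here; this uses a simple average hash as a stand-in, and the function names, the tiny 4x4 grayscale grids and the `max_distance` threshold are all assumptions for illustration only.

```python
from typing import List


def average_hash(pixels: List[List[int]]) -> int:
    """Fingerprint a grayscale grid: one bit per pixel, set if above the mean.

    Stand-in for a real perceptual hash; NOT the PhotoDNA algorithm.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


def matches_known(fingerprint: int, known: List[int], max_distance: int = 2) -> bool:
    """True if the fingerprint is within max_distance bits of any known one."""
    return any(hamming_distance(fingerprint, k) <= max_distance for k in known)


# Hypothetical 4x4 grayscale images (0-255).
original = [[10, 200, 10, 200], [200, 10, 200, 10]] * 2
brightened = [[p + 5 for p in row] for row in original]  # same picture, slightly lighter
unrelated = [[200, 200, 10, 10]] * 4

known_hashes = [average_hash(original)]
print(matches_known(average_hash(brightened), known_hashes))  # True
print(matches_known(average_hash(unrelated), known_hashes))   # False
```

The point of a perceptual fingerprint over a cryptographic hash is that small changes to the image (here, a brightness shift) leave the fingerprint close to the original, so the lookup compares by distance rather than by exact equality.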
"Microsoft has a system called PhotoDNA"
I must have mistakenly thought that was openly searchable.
Well, I tried accessing the PhotoDNA website using my Microsoft/Skype credentials, but the website errors out with a nasty 500 error. It seems PhotoDNA is meant to combat child pornography, and it's free for qualifying customers. I have not been able to figure out whether we could offer it as an addon, because the website crashes. Even on Microsoft Edge.
There is an FAQ available for the service, but it doesn't help in figuring out whether node admins could qualify as free users: https://www.microsoft.com/en-us/PhotoDNA/FAQ
I gave contacting PhotoDNA a whirl too. I received a reply from them yesterday and I'm replying to them to see what they say. They may show up here to discuss it more for all I know.
Expected behavior
Have a way for admins to be notified if their server contains copyrighted photos or material evidence of a crime.
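A rough sketch of what such a check could look like, assuming each photo row stores a hash and a table of known hashes is available locally. Everything here is hypothetical: the `photos` and `known_hash` schemas, the column names and the sample values are made up for illustration, and SQLite stands in for MariaDB so the example is self-contained.

```python
import sqlite3


def find_flagged_photos(conn: sqlite3.Connection) -> list:
    """Return (uid, photo id) pairs whose hash appears in the known-hash table."""
    return conn.execute(
        """
        SELECT p.uid, p.id
        FROM photos AS p
        JOIN known_hash AS k ON k.hash = p.hash
        ORDER BY p.uid, p.id
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schema, not Friendica's actual table layout.
    CREATE TABLE photos (id INTEGER PRIMARY KEY, uid INTEGER NOT NULL, hash TEXT NOT NULL);
    CREATE TABLE known_hash (hash TEXT PRIMARY KEY);
    -- Index the hash column so the join does not scan every photo row.
    CREATE INDEX photos_hash_idx ON photos (hash);
    """
)
conn.executemany(
    "INSERT INTO photos (id, uid, hash) VALUES (?, ?, ?)",
    [(1, 7, "aaa"), (2, 7, "bbb"), (3, 9, "ccc")],
)
conn.executemany("INSERT INTO known_hash (hash) VALUES (?)", [("bbb",), ("zzz",)])

flagged = find_flagged_photos(conn)
print(flagged)  # [(7, 2)]
```

An exact-match join like this only works if the hash function yields identical output for identical content; a fuzzy fingerprint would need a distance comparison instead of `=`.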
Additional background
Sam Harris dropped a podcast very recently and I'm a bit freaked out by it: #213 - THE WORST EPIDEMIC. During the interview, they discussed the ability to match the output of a hash function on an image against a table of hashes identifying child abuse and sexual exploitation imagery. It wouldn't be too hard to store the output of the same hash function as a field in the photos table and maybe keep a copy of the table of known hashes of problematic imagery. That way an admin could check by way of their Friendica interface whether a join on the two tables had any results identifying a given user. The admin could take whatever actions they deemed appropriate at that time. In the interview, they indicate that 3%-5% of people on the earth (uniformly distributed) are involved in the production or consumption of child abuse imagery. That means I might have 13,000 images on either of my two instances and have no idea they are there or any way of properly removing them.
Actual behavior
I don't see any way to protect and manage what goes on my hard drives.
Steps to reproduce the problem
photos table.
phpMyAdmin and lose the ability to browse the table because it's taking a wicked long time to sort.
Friendica version you encountered the problem
2020.07; the database version is 1355 and the post-update version is 1350.
Friendica source (git, zip)
git
PHP version
7.0.33
SQL version
MariaDB 10.5.4