2020PB / police-brutality

Repository containing evidence of police brutality during the 2020 George Floyd protests
MIT License

All content needs to be rehosted #32

Open zimmertr opened 4 years ago

zimmertr commented 4 years ago

These videos, images, and other information should be downloaded and rehosted elsewhere in addition to posting the original source. Otherwise the content is at risk of being removed.

The content should also be easily available for mass download. This will prevent the loss of the content in the event that this repository is removed. Perhaps this could use something like BitTorrent or a self-hosted peer-to-peer synchronization program akin to Google Drive/Dropbox.

I'm happy to donate towards hosting fees if necessary.

2020PB commented 4 years ago

We could use some CI rules to auto archive the linked footage. We currently have a repo with most of the videos archived (link on the README) but we should definitely automate it.
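A minimal sketch of the first step such a CI rule would need: collecting the linked footage URLs from the repository's Markdown files so they can be handed to an archiver. The function name, the regular expression, and the sample text are assumptions for illustration, not the project's actual tooling.

```python
import re

# Matches http(s) URLs up to whitespace or a closing bracket/parenthesis,
# so Markdown links like [video](https://...) are handled as well.
URL_PATTERN = re.compile(r"https?://[^\s)>\]]+")


def extract_links(markdown_text: str) -> list:
    """Return the unique URLs in the text, in first-seen order."""
    urls = (
        match.group(0).rstrip(".,;")  # drop trailing sentence punctuation
        for match in URL_PATTERN.finditer(markdown_text)
    )
    return list(dict.fromkeys(urls))  # dict keys preserve insertion order


# Hypothetical report snippet; real reports may be laid out differently.
sample = (
    "* https://twitter.com/example/status/1\n"
    "* [video](https://example.com/v.mp4)\n"
    "* https://twitter.com/example/status/1\n"
)
links = extract_links(sample)
```

A CI job could run this over every changed file and pass each new link to whatever downloader or archive service is chosen.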

zimmertr commented 4 years ago

Thanks for pointing out the other repo in the Readme. I admit I missed this when glossing over it. However, I am not sure this is a long-term solution given that the contents are not hosted and published by a third party. Right now they are at the mercy of GitHub/Microsoft, which has not necessarily acted favorably towards less legal repos in the past.

Not to indicate this is currently less than legal. But I also don't trust that illegality is a requirement for action anymore. And I think that Microsoft would take a stance against the public if under enough pressure.

2020PB commented 4 years ago

That's correct, however that repo is also archived on IPFS so if it goes down it will not be gone. Do you know if there is a way to have an external API verify that it is being called by the CI scripts on a github repo? I have an API we can upload files to for IPFS reupload, but I don't want it to get spammed when I add the token.

I will look into this tonight, but if you know of a good way to do this please lmk and I will try to set something up.

zimmertr commented 4 years ago

I'm sorry I don't. However, I do have infrastructure development skills if you need assistance building out a server/cloud infrastructure for this project.

2020PB commented 4 years ago

Awesome, I will let you know! I don't think it should be necessary because we already have a good resource for hosting stuff, but you never know what we'll need later!

hunterwilliams commented 4 years ago

@2020PB "Do you know if there is a way to have an external API verify that it is being called by the CI scripts on a github repo? I have an API we can upload files to for IPFS reupload, but I don't want it to get spammed when I add the token."

This should be a non-issue. You can store the token as a secret on the CI server so that no one can read it back or see it, provided no one merges a Pull Request that somehow exposes it in the CI logs. Note: I assume the host header could be checked as well.

@2020PB I've created this -> https://github.com/hunterwilliams/link-archival which can assist with archiving everything in an automated fashion. If there is a file structure etc. you want, that would be good to know. It needs a bit of process before it can just be dropped in, and I'm willing to assist. It can be used now, though, to download/screenshot some items (I haven't gotten through automating all of the video downloads, just Twitter).
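One way to sketch this, assuming the token is injected into the CI job as an environment variable: the CI job signs each upload with the token, and the API rejects anything whose signature does not match. HMAC request signing is swapped in here as an illustration; the thread does not specify a scheme, and `ARCHIVE_API_TOKEN` is a hypothetical variable name.

```python
import hashlib
import hmac
import os


def sign(body: bytes, secret: str) -> str:
    """Return a hex HMAC-SHA256 signature of the request body."""
    return hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()


def is_authorized(body: bytes, signature: str, secret: str) -> bool:
    """Server side: accept the upload only if the signature matches."""
    expected = sign(body, secret)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected.encode(), signature.encode())


# CI side: the secret comes from the environment, never from the repo.
# The fallback value exists only so this sketch runs outside of CI.
secret = os.environ.get("ARCHIVE_API_TOKEN", "example-only-secret")
payload = b'{"url": "https://example.com/video.mp4"}'
signature = sign(payload, secret)
```

With this shape the token itself never travels with the request, so a leaked log line containing a signature does not let anyone sign new uploads.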

mjmaurer commented 4 years ago

Love the work being done here! To me, the ideal infra would be uploading images/vids to ipfs and using a P2P solution (gunjs, orbitdb) for storing structured JSON that links out to IPFS and includes additional metadata.

Edit: I think IPFS rehosting is more realistic for this repo as it exists
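A rough sketch of what one structured JSON record linking out to IPFS could look like. Every field name, the placeholder CID, and the choice of the public `ipfs.io` gateway are assumptions for illustration; the thread does not define a schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class MediaRecord:
    source_url: str            # where the footage was originally posted
    ipfs_cid: str              # content identifier of the rehosted copy
    description: str = ""      # free-text metadata
    tags: List[str] = field(default_factory=list)


def gateway_url(record: MediaRecord) -> str:
    """Build a public-gateway URL for the rehosted file."""
    return f"https://ipfs.io/ipfs/{record.ipfs_cid}"


record = MediaRecord(
    source_url="https://example.com/original-post",
    ipfs_cid="EXAMPLE_CID_PLACEHOLDER",  # not a real CID
    description="Hypothetical entry",
    tags=["example"],
)

# One JSON object per record; this is the part a P2P store would hold.
manifest_line = json.dumps(asdict(record), sort_keys=True)
```

Keeping the original source URL next to the CID means the record still points somewhere useful if either copy disappears.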

mjmaurer commented 4 years ago

I'd be happy to build in IPFS hosting, but I'd like #163 to be addressed. Rehosting on P2P makes it much more likely to always exist. I personally wouldn't want an image of me widely circulated without my consent.

bonedaddy commented 4 years ago

I've been archiving data onto IPFS, and have the following archived media:

Hosting data on IPFS may be a bit tricky if you want to be anonymous or not be publicly known as backing up the data. With IPFS it is trivial to find and trace the people hosting content. If you want anonymity or to not be identified as someone backing up the data, IPFS is not a good idea.

Latest archive

mjmaurer commented 4 years ago

As it is, someone gives up anonymity in the form of a PR

ubershmekel commented 4 years ago

@mjmaurer if you need anonymity - you can message the mods on reddit. Is that good enough?

valadect commented 4 years ago

I mentioned this in the other issue, but I made a script that downloads all the videos and also screenshots the webpages for posterity. Hopefully that helps with the ephemeral nature of the internet. It is downloading all the links now, so it is very much still a WIP, but feel free to play around with it: https://github.com/valadect/pbbackup

nathanfranke commented 4 years ago

Would it be illegal to post media files in the repo itself so that people can back them up simply by cloning the repo? Of course this could just be a secondary backup method. (And bear with me since I am new to this community).

valadect commented 4 years ago

@nathanfranke Considering no one is profiting from it and also the fact that it is for educational use it should be fine for the most part.

tgalopin commented 4 years ago

I'd be happy to provide a backup server in France if that's useful, to avoid US law and companies.

ubershmekel commented 4 years ago

@nathanfranke there's this repo which I think is dedicated to files: https://github.com/pb-files/pb-videos

Though we don't have a good way to link between the two repos yet.

bonedaddy commented 4 years ago

If you want to mirror media locally, and optionally upload it to an IPFS node, check out the downloader tool in the tools folder.

modelmat commented 4 years ago

Perhaps reposting on LBRY might be useful?

krmax44 commented 4 years ago

Maybe looking into Archive.org for hosting might be worth a shot. They offer an S3-like API to upload files: https://github.com/vmbrasseur/IAS3API#internet-archive-s3-api-documentation
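A minimal sketch of how an upload request for that S3-like API could be assembled. The endpoint and header names (`authorization: LOW key:secret`, `x-amz-auto-make-bucket`, `x-archive-meta-mediatype`) are written from memory of the linked documentation and should be checked against it. The item identifier, filename, and keys are placeholders, and nothing is sent over the network.

```python
import urllib.parse
import urllib.request

IA_S3_ENDPOINT = "https://s3.us.archive.org"  # assumed endpoint; verify in the docs


def build_upload_request(identifier, filename, data, access_key, secret_key):
    """Build (but do not send) a PUT request for one file in one item."""
    url = "{}/{}/{}".format(
        IA_S3_ENDPOINT,
        urllib.parse.quote(identifier),
        urllib.parse.quote(filename),
    )
    headers = {
        "authorization": f"LOW {access_key}:{secret_key}",
        "x-amz-auto-make-bucket": "1",         # create the item if it is missing
        "x-archive-meta-mediatype": "movies",  # item-level metadata
    }
    return urllib.request.Request(url, data=data, headers=headers, method="PUT")


# Placeholder values only; real keys would come from a CI secret.
request = build_upload_request(
    "example-item", "clip 01.mp4", b"video-bytes", "ACCESS", "SECRET"
)
```

Passing the result to `urllib.request.urlopen(request)` would perform the actual upload.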

DavidVorick commented 4 years ago

Hey guys, just learning about this project, happy to host everything on Skynet. Skynet is a platform similar to IPFS, except instead of seeding the files yourself, a decentralized platform called Sia (similar to what Filecoin is meant to be) seeds the files for you. You get uptime + decentralization without having to host anything yourself.

How can I get started?

Is there a chatroom somewhere? Some of this might be easier to do in real time. I've got questions like:

ubershmekel commented 4 years ago

I'm not a lawyer, but I hope this data would fall under "fair use" such as in a documentary: https://en.wikipedia.org/wiki/Fair_use#Documentary_films

nathanfranke commented 4 years ago

I would hope it is considered criticism or documentary since getting permission from all filmers would be effectively impossible.

xloem commented 4 years ago

Please link to the mirrors inside the repository so they can be found.

I have made a barebones sia-skynet remote for git-annex in https://github.com/xloem/gitlakepy (EDIT: fixed link) which should help skynet interoperate with git or datalad a little. I have also made a barebones bsv remote for git at https://github.com/xloem/git-remote-bsv providing for storing lightweight git repositories on a different storage-oriented blockchain than skynet. Unlike skynet, bsv content cannot eventually be lost on the network.

karan commented 4 years ago

Absolutely need to do this.

What about using the Internet Archive? If needed, I have a few TB of storage on my NAS I can donate on a temporary basis as well.