Chocobozzz / PeerTube

ActivityPub-federated video streaming platform using P2P directly in your web browser
https://joinpeertube.org/
GNU Affero General Public License v3.0

Remote storage mounted via FUSE, Amazon S3, SFTP and more #147

Closed ghost closed 4 years ago

ghost commented 6 years ago

UPDATED by @rigelk:

There are two ways to see remote storage:

  1. the instance admin adds remote storage via the admin interface. Arguably they can already do that by hand if they have access to the host machine, but that's another story.
  2. the registered user adds their credentials to remote storage via their user interface. This way they can increase their storage capacity independently of their own quota.

For those who have followed the discussions about IPFS, the problem is not exactly the same. Here it is simpler, since we do not need to consider each video/group of videos, nor its integration with the client, to really benefit from it. We just consider the remote storage as a bucket sitting at the disposal of the PeerTube server.


Libraries we should look at:

We should also consider adding a caching system, since most of these backends are slow by nature (a.k.a. cold storage):
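
For illustration, a minimal read-through cache sketch in TypeScript (the cache path and the fetchRemote helper are hypothetical, just to show the hit/miss logic):

import { promises as fs } from 'fs'
import { join } from 'path'

const CACHE_DIR = '/var/cache/peertube' // hypothetical local cache location

// Serve from the local disk cache when possible; otherwise fetch from the
// slow remote backend and keep a copy for the next request.
async function getObject (name: string, fetchRemote: (n: string) => Promise<Buffer>): Promise<Buffer> {
  const cachedPath = join(CACHE_DIR, name)
  try {
    return await fs.readFile(cachedPath) // cache hit
  } catch {
    const data = await fetchRemote(name) // cache miss: hit the cold storage
    await fs.writeFile(cachedPath, data)
    return data
  }
}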

Now, I would be cautious about the pricing models of some of the supported services. Disabling them by default and showing a huge "here be dragons" banner before activating them would be the least we can do.

ghost commented 6 years ago

@Chocobozzz are you planning S3 / S3-compatible storage support?

Chocobozzz commented 6 years ago

No, not now sorry.

Chocobozzz commented 6 years ago

Maybe we could forbid users from uploading more videos if the disk is x% full, for example.

There is also a per-user video quota (already implemented).

ghost commented 6 years ago

hm okay~ thanks!

cjeanneret commented 6 years ago

Object Storage support would be really convenient: cheap, elastic, reliable (well, should be at least ;)).

Note: we might use a FUSE layer in order to "mount" a bucket and access it like we would standard storage, although it's slow and might create some issues regarding the cost of API calls…

julienfastre commented 6 years ago

We are discussing with some content providers about switching to PeerTube. If we do it, we would like to use our Docker Swarm infrastructure.

In this case, storing the files on OpenStack Swift or Amazon S3 would make things much easier.

Another feature linked to this would be to serve files directly from the OpenStack Swift container, or possibly Amazon S3, using temporary URLs. This would reduce the need for caching and the need for a strong bandwidth infrastructure (even if the nature of this project already reduces that need).

Would this be easily doable?

rigelk commented 6 years ago

@julienfastre integration of temporary URLs is non-trivial, as we would have to change not only the backend but also how the client works; but the bandwidth reduction is indeed drastic. I guess the closest package for generating temporary URLs would be pkgcloud.
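
To make the idea concrete, a hedged sketch of generating such a temporary URL with the AWS SDK for JavaScript v3 (bucket and key names are made up; pkgcloud would be the multi-provider equivalent):

import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3'
import { getSignedUrl } from '@aws-sdk/s3-request-presigner'

const s3 = new S3Client({ region: 'us-east-1' })

// Produce a URL that is only valid for 10 minutes, so the bucket stays
// private while the client still fetches the file directly from storage.
async function temporaryVideoUrl (key: string): Promise<string> {
  const command = new GetObjectCommand({ Bucket: 'peertube-videos', Key: key })
  return getSignedUrl(s3, command, { expiresIn: 600 })
}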

julienfastre commented 6 years ago

I wonder if returning a 307 HTTP status code would not suffice to redirect the client automatically.

I think I already have this behaviour with the Docker registry (I believe the Docker registry's OpenStack Swift storage driver returns an HTTP 307 response when configured accordingly).

But this should be double-checked.
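
A rough sketch of that idea on the PeerTube side, assuming an Express route for the WebSeed and the hypothetical temporaryVideoUrl() helper from above:

import express from 'express'

const app = express()

// Instead of proxying the bytes through the instance, answer the WebSeed
// request with a temporary redirect straight to the object storage.
app.get('/static/webseed/:filename', async (req, res) => {
  const url = await temporaryVideoUrl(req.params.filename) // hypothetical helper
  res.redirect(307, url)
})

app.listen(9000)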

rigelk commented 6 years ago

@julienfastre what do you want to redirect? It's not like we will redirect the whole view directly to the temporary URL… it has to be fed to the player.

EDIT: after @Chocobozzz explained it could be a redirection on the WebSeed, I get it. But generating the WebSeed dynamically for each request to it?

Chocobozzz commented 6 years ago

@julienfastre I'm sorry, but I don't have any knowledge of Swift. An HTTP redirection could be a good idea, but I don't know if the WebSeed implementations of torrent libs follow redirections. You could test it :)

For object storage (like S3), I'm not convinced that implementing it directly in PeerTube is a good idea, because streaming videos requires a lot of bandwidth (-> poor performance and significant costs).

I think using something like goofys with a cache is a better way to do this. But I have never tested it (or other software that does the same thing, i.e. mounting a bucket directly on the filesystem with a cache), so feedback is very welcome :)

McFlat commented 6 years ago

I would recommend using an S3-compatible alternative like minio.io or wasabi.com. I know that Mastodon allows storing files in Wasabi, minio.io or S3, so it would be nice to cut down storage costs.


Wasabi is also advertised as 6x faster than S3, so it's a great alternative since speed is a concern here. For my instance I'm using EFS on AWS and the cost is growing quickly, so I'm looking for another solution for storing all these videos; Wasabi seems to be a great one. https://wasabi.com/s3-compatible-cloud-storage/

McFlat commented 6 years ago

I only have nearly 1 TB of videos currently uploaded to my instance, but I'm looking at close to 80 TB eventually, and at that point the price will be too much for me to afford. If I switch to wasabi.com it's more affordable, plus Wasabi is really snappy.

rigelk commented 6 years ago

@McFlat in the meantime, you might want to try https://wasabi-support.zendesk.com/hc/en-us/articles/115001744651-How-do-I-use-S3FS-with-Wasabi-

sunjam commented 6 years ago

WebDAV integration would be great since it is supported on all operating systems, plus Nextcloud, ownCloud, Pydio and Seafile. The Nextcloud team (@rullzer and @schiessle) could be a good resource for WebDAV matters. Their IRC channel is #nextcloud-dev.

tcitworld commented 6 years ago

If I'm not mistaken, the issue with WebDAV is that you need to download the whole file before serving it, which would be quite stupid here. https://github.com/nextcloud/server/pull/9178 may be a solution here, but it's Nextcloud-specific and would require an adapter.

McFlat commented 6 years ago

Not to mention WebDAV is riddled with security issues: https://www.networkworld.com/article/2202909/network-security/-webdav-is-bad---says-security-researcher.html

sunjam commented 6 years ago

That short article is from 2011 and the bugs it briefly mentions are from 2010. There is no link to any actual report, and the links all 404.

ROBERT-MCDOWELL commented 6 years ago

@sunjam some A.I. fake accounts are growing on GitHub

Nutomic commented 6 years ago

Any hints on where I'd have to start to implement this? Specifically, as an admin I want the option to move all the video files to S3-compatible storage. Thumbnails, previews, etc. could also be moved there in another step.

edit: ping @Chocobozzz @rigelk

Chocobozzz commented 6 years ago

Any hints where I'd have to start to implement this?

Did you test the libraries linked in the first post?

Nutomic commented 6 years ago

I didn't test them, but here's what I think based on their descriptions:

There is also aws-sdk-js, but it mainly supports Amazon S3 itself, which is very expensive.

I think pkgcloud is by far the best option, as it would make configuration easy (no need to set up a filesystem mount). It would also help us offload other things like transcoding to different servers.
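
As a rough sketch of what the pkgcloud route could look like (credentials, container name and file path are made up; pkgcloud exposes uploads as writable Node streams):

import { createReadStream } from 'fs'
// pkgcloud ships without type definitions, hence the require
const pkgcloud = require('pkgcloud')

const client = pkgcloud.storage.createClient({
  provider: 'amazon',  // also 'openstack', 'azure', ...
  keyId: 'ACCESS_KEY', // hypothetical credentials
  key: 'SECRET_KEY',
  region: 'us-east-1'
})

// Pipe a transcoded video straight into the bucket, no FUSE mount needed.
createReadStream('/var/www/peertube/storage/videos/abc.mp4')
  .pipe(client.upload({ container: 'peertube-videos', remote: 'abc.mp4' }))
  .on('error', (err: Error) => console.error(err))
  .on('success', (file: { name: string }) => console.log('stored', file.name))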

techknowlogick commented 5 years ago

I've tested using goofys, and while it handles serving files just fine, it runs into problems uploading files (and transcoding them into different resolutions).

I agree with @Nutomic's assessment that pkgcloud is the best option.

witcheslive commented 5 years ago

Not having this ability is basically what's standing between me and setting up a PeerTube instance. Local storage on a VPS is way too expensive if anyone, gods forbid, actually uses it, but Wasabi is cheap and fast as hell (when it bothers to be up 🤣), and I wouldn't have to worry about bandwidth usage whatsoever because Wasabi doesn't charge for it.

Someone mentioned using FUSE when I brought this up, and while it would solve the cheap-storage problem, it wouldn't solve the bandwidth/CDN problem, because the instance server would still act as a media proxy in this case. Unlike how I have my Mastodon server witches.live configured with a media proxy, I think it'd be best here to go directly to the Wasabi buckets to be as fast (and cheap) as possible.

shleeable commented 5 years ago

Hey, I really would love to see S3 or similar supported... my traffic/disk usage is definitely a concern with the other services and I'd love to see this migrated...

How is this looking?

libertysoft3 commented 5 years ago

I've got a fork going which supports having videos in an s3fs-mounted directory. The trick is that you have to transcode locally and then copy to S3. https://github.com/libertysoft3/PeerTube

Chocobozzz commented 5 years ago

@libertysoft3 It's not a bad idea. PR welcome :)

Serkan-devel commented 5 years ago

Wouldn't that be a proprietary integration into this FOSS platform?

ldidry commented 5 years ago

@Serkan-devel You can have S3 storage with Ceph or MinIO, which are open source, so no.

Serkan-devel commented 5 years ago

Ok, fair enough

libertysoft3 commented 5 years ago

@Chocobozzz alright, here's a pull request: https://github.com/Chocobozzz/PeerTube/pull/1810 This is slightly better than my previous attempt, which introduced a new storage directory for transcoding.

Anyway, this theoretically provides full "manually mounted" support for s3fs, goofys, or whatever.

Nutomic commented 5 years ago

This issue can be closed, right?

Chocobozzz commented 5 years ago

@Nutomic Yep, but I just want to add a section in the documentation first

Chocobozzz commented 4 years ago

Added some documentation in https://framagit.org/framasoft/peertube/documentation/commit/f4c5c55632b2bbb5436541046f46ba0882842167

tilllt commented 4 years ago

Did some tests with Backblaze B2 storage, using MinIO as a B2-to-S3 gateway... I couldn't really do anything useful with it since I pretty soon ran into the limits of the free trial account on Backblaze, specifically the Class C transaction caps. I'm sure there are a lot of things that could be optimized in the way I did things (to access the cloud storage less frequently), but as far as I can tell it seemed to work.

This extends your local storage with cloud storage if you are running out of space. In this case Backblaze B2 is used, but check min.io's compatibility: you could use Amazon S3, Google Cloud Storage, Azure, a NAS, etc.

Create Backblaze B2 Account & Bucket

http://www.backblaze.com/b2

Install Minio (Docker)

minio_b2:
  image: minio/minio
  ports:
    - "9000:9000"
  volumes:
    - /media/b2_peertube:/data
  environment:
    MINIO_ACCESS_KEY: my-backblazeb2-admin-key
    MINIO_SECRET_KEY: my-backblazeb2-secret-key
  command: gateway b2
  healthcheck:
    test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
    interval: 30s
    timeout: 20s
    retries: 3

Install Fuse

https://en.wikipedia.org/wiki/Filesystem_in_Userspace

Install S3fs

https://github.com/s3fs-fuse/s3fs-fuse

Save Minio / Backblaze Credentials:

echo "my-backblazeb2-admin-key:my-backblazeb2-secret-key" > /etc/s3cred
chmod 600 /etc/s3cred

Test Backblaze B2 > Minio > S3 > Fuse Mount

s3fs  BB-Eval /media/b2_peertube -o passwd_file=/etc/s3cred,use_path_request_style,url=http://127.0.0.1:9000

Check to see that the bucket is mounted with the mount command:

mount | grep s3fs  

s3fs on /media/b2_peertube type fuse.s3fs (rw,nosuid,nodev,relatime,user_id=0,group_id=0)

Use https://romanrm.net/mhddfs to combine local storage and cloud storage into one new storage space for PeerTube:

mkdir /media/peertube_combined
mhddfs /opt/docker-startup/peertube/docker-volume/data/,/media/b2_peertube/ /media/peertube_combined/

Change the PeerTube data directory to /media/peertube_combined

Restart Docker

mj-saunders commented 3 years ago

When following this: https://docs.joinpeertube.org/admin-remote-storage am I correct in saying that PeerTube can't host locally at the same time as using an S3 server? The bind mount clobbers the local video directory, so would I have to migrate any existing local content to the remote storage?

It didn't seem to warrant its own issue, but I'll create one if necessary. Sorry if I cause any inconvenience.

tiotrom commented 3 years ago

Quick question: does this mean I can extend the storage of my PeerTube instance, or that I will replace it? Because if it extends it, then it is fantastic, but if I would lose the already locally stored videos, then.... Cheers!

tilllt commented 3 years ago

Both are possible. https://github.com/trapexit/mergerfs

tiotrom commented 3 years ago

Superb!

normen commented 3 years ago

Has anyone gotten this to work properly with live streaming? When live streaming, I'm uploading way more data to the S3 than I receive, causing errors in writing the files properly. My PeerTube server is set to use HLS only, no transcoding.

[tube.bitwaves.de:443] 2021-02-09 22:45:37.762 error: Cannot copy segment /data/streaming-playlists/hls/3f5c4a3c-dad3-4eda-99cb-aab03c3a8aea/0-000029.ts to repay directory. {
  "err": {
    "stack": "Error: EIO: i/o error, close",
    "message": "EIO: i/o error, close",
    "errno": -5,
    "code": "EIO",
    "syscall": "close"
  }
}

I think the problem is that sequential writes to the same file are killing the upload capacity: each time data is appended to a .ts file, the whole file is uploaded again (with s3fs).

@Chocobozzz any chance to increase the in-memory buffer for these chunks so they can be written in one piece? For my setup with OBS, the resulting .ts fragments are about 700 kB each but are written in 200 kB chunks, causing each file to be uploaded 4 times. I guess the final creation of the combined file is a whole different thing; however, I don't get errors there.

Edit: Looking at the whole issue again, the main problem is that the combined file is generated in the same S3 folder when saving the video (publish after recording). Without saving there are no errors in the backend, but I do get these quite often in the web player:

[Error] VIDEOJS: "ERROR:" "(CODE:2 MEDIA_ERR_NETWORK)" "HLS.js error: networkError - fatal: true - manifestParsingError" {code: 2, message: "HLS.js error: networkError - fatal: true - manifestParsingError", status: null, …}

Edit: Created an issue for this: https://github.com/Chocobozzz/PeerTube/issues/3735

gsugambit commented 2 years ago

Are there instructions online for making PeerTube work with Google Cloud Storage? I can't seem to find any anywhere. I'm trying to run PeerTube in Docker and have the files retrieved from/uploaded to GCS.

tilllt commented 2 years ago

MinIO supports Google Cloud Storage and a wide variety of other providers and provides a standardized S3 interface for them.

https://docs.min.io/docs/minio-gateway-for-gcs.html

tio-trom commented 1 year ago

Isn't WebDAV now supporting partial content delivery? Meaning one could stream a video from WebDAV without downloading the entire file first. I find a WebDAV mount point way more reliable than object storage, which I could not set up properly considering the ACL rules and such.
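
For what it's worth, partial content delivery boils down to plain HTTP Range requests, so a quick hedged check against any WebDAV server (URL hypothetical, Node 18+ global fetch) could look like:

// Returns true if the server honours Range requests (206 Partial Content)
async function supportsRanges (url: string): Promise<boolean> {
  const res = await fetch(url, { headers: { Range: 'bytes=0-1023' } })
  return res.status === 206
}

supportsRanges('https://dav.example.com/videos/abc.mp4').then(console.log)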