Closed: leotrs closed this issue 4 years ago
The maximum file size on GitHub is 100MB so storing video in a repository probably won't last very long, especially if we use high quality video. We may end up having to host it ourselves somehow.
I will consider youtube or something like that for videos, but for images and gifs a repo would be useful.
Would it be more useful for one of us to create a website? I could help with that for sure. That way we can have it all in one easily accessible place that doesn't have a file storage limit, and if we really need to, we can host the files on a separate platform.
What are the other options besides having a GitHub repo? I don't see any issue in having such a thing.
The maximum file size on GitHub is 100MB so storing video in a repository probably won't last very long, especially if we use high quality video. We may end up having to host it ourselves somehow.
This is fine for all the quick examples; 100MB is enough. For longer and more complicated examples, I think we can use YouTube. But the problem with YouTube is that the account can't be accessed by everyone without some weird password-sharing system.
This is fine for all the quick examples; 100MB is enough. For longer and more complicated examples, I think we can use YouTube
Good idea! I have rarely had file sizes bigger than 50 MB, even for long and complicated scenes, so I think this will not occur too often.
Another thing we should consider is the maximum total size of the repo. I could only find an old answer (from 2017) here. It says that there is no hard limit, but you get a notification from GitHub when your repo exceeds 1 GB.
Adding a new repository to the ManimCommunity organization is a fairly big change to the face of the organization, and I would not want to do that without making sure the core team approves
To keep the current face of the organization: we could make this doc-files repo a private repo and see if jsdelivr can somehow get access to that.
To keep the current face of the organization: we could make this doc-files repo a private repo and see if jsdelivr can somehow get access to that.
Jsdelivr fetches only from public repositories, but yes, it caches files permanently. They say so at https://www.jsdelivr.com/features:
We use a permanent S3 storage to ensure all files remain available even if GitHub goes down, or a repository or a release is deleted by its author. Files are fetched directly from GitHub only the first time, or when S3 goes down.
I think the examples gallery will only rarely contain examples larger than 100MB. If that ever happens, we can use YouTube for those few. The rest can go in a repo.
The repo should be public, because open source! But also because I think private repos are not free (as in they cost money).
Not sure if it's necessary to create a whole separate site for this though? I'd much rather just host the documentation in RTD, until we're ready to have a fully fledged actual website for the project.
private repos are not free (as in they cost money).
They changed this. Private repos are also free now: https://github.com/pricing But you are right, it would be better to have this open-source. I feel comfortable with the idea to have a public second repo for img and gifs.
A repo for images and stuff sounds ok to me. But for videos that exceed a certain size, we might see ourselves forced to use YouTube, as was said here already.
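To make the "repo for most files, YouTube for the few large ones" split concrete, here is a minimal sketch that sorts rendered files by the 100MB per-file limit mentioned above. The function names and the `media` directory in the usage note are hypothetical, and the exact byte value of the limit is an assumption.

```python
from pathlib import Path

# Per-file limit discussed in this thread (100MB); the exact byte value is an assumption.
GITHUB_FILE_LIMIT_BYTES = 100 * 1024 * 1024


def split_by_size(names_and_sizes, limit=GITHUB_FILE_LIMIT_BYTES):
    """Partition (name, size_in_bytes) pairs into files that fit and files that are too large."""
    fits, too_large = [], []
    for name, size in names_and_sizes:
        (fits if size <= limit else too_large).append(name)
    return fits, too_large


def scan_directory(root):
    """Collect (relative path, size in bytes) pairs for every file under root."""
    root = Path(root)
    return [
        (str(p.relative_to(root)), p.stat().st_size)
        for p in root.rglob("*")
        if p.is_file()
    ]
```

Something like `fits, too_large = split_by_size(scan_directory("media"))` would then list which files could go in the repo and which would need another host.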
I'd just like to point out that GitHub can store files larger than 100MB. It's still free to use and everything, but you have to set up Git LFS (Large File Storage), which is a bit of a pain.
How long a video is a 100MB file at high quality?
How many hours of video could we store in a barebones GitHub repository?
Even if we do find a way to do it with github, I'd much rather use something that's designed for file storage to store videos and gifs.
How long a video is a 100MB file at high quality?
I just made a test with this script:
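from manim import *


class TestExample(Scene):
    def construct(self):
        s = Square()
        c = Circle()
        # 60 iterations of two animations each; at the default run time
        # of 1 second per animation this gives about 2 minutes of video.
        for i in range(60):
            self.play(Transform(s.copy(), c.copy()))
            self.play(Transform(c.copy(), s.copy()))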
The video is 1080p 60fps and 2 minutes long -> TestExample.mp4 has a size of 6.2 MB -> TestExample.gif has a size of 52.8 MB
Holy crap what a difference. So I don't think we'll have many examples that are longer than 2min. I'm assuming that examples will be short and succinct. BUT, a single 2min example occupying 50MB is kind of ridiculous, and the git history is going to explode.
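As a rough answer to the earlier questions about how much video fits, the measurements from the test render can be extrapolated linearly. This is only a back-of-the-envelope sketch: file size depends heavily on scene content and encoder settings, and the 1000 MB figure stands in for the approximate 1 GB notification threshold mentioned earlier in the thread.

```python
# Measurements from the test render above (1080p, 60fps, 2 minutes).
MP4_MB, GIF_MB, CLIP_MINUTES = 6.2, 52.8, 2
FILE_LIMIT_MB = 100        # per-file limit mentioned in this thread
REPO_SOFT_LIMIT_MB = 1000  # approximate 1 GB notification threshold (assumed round figure)


def minutes_that_fit(budget_mb, clip_mb, clip_minutes=CLIP_MINUTES):
    """Linear extrapolation: minutes of footage that fit into budget_mb."""
    return budget_mb / clip_mb * clip_minutes


mp4_minutes_per_file = minutes_that_fit(FILE_LIMIT_MB, MP4_MB)          # about 32 minutes
gif_minutes_per_file = minutes_that_fit(FILE_LIMIT_MB, GIF_MB)          # about 3.8 minutes
mp4_hours_per_repo = minutes_that_fit(REPO_SOFT_LIMIT_MB, MP4_MB) / 60  # about 5.4 hours
```

Under these assumptions, a single 100MB mp4 holds about half an hour of this kind of scene, while a gif of the same size holds under four minutes.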
@eulertour what do you suggest we use instead?
BUT, a single 2min example occupying 50MB is kind of ridiculous, and the git history is going to explode.
I think it's worth noting that the video was rendered in 1080p at 60FPS. I don't think we need to have every single example animation rendered at such a high quality.
Fair. But still, now I'm on the fence.
git history is going to explode.
We could write a bot that cleans the git history from time to time. I think version control is not so important in the case of videos.
But if there is another possible solution (maybe from eulertour), I would prefer that as well.
If we don't put it in version control, then we can't use GitHub (or GitHub + jsdelivr) as hosting.
I think we should just use YouTube and share a password :shrug:
To me, using YouTube would be a waste of time. If you want to add or update a scene example, you will have to ask for the password, log in, remove the old video, upload the replacement, and put the new link in the docs.
I think that with something more collaboration-oriented, such as GitHub, a more automated workflow would be possible.
I mean we could just set the password once and share it with all devs, no need to ask every time. But yes I see your point and I agree it could get old really fast.
I mean we could just set the password once and share it with all devs, no need to ask every time.
Google hates it when people on opposite sides of the planet log into the same account, anti-hacking measures and all. I'd much rather have just a single person handle everything Youtube related if we go that route.
Google hates it when people on opposite sides of the planet log
Very, very good point. 100% the account will get suspended/asked for mail verification ten times per day.
Very, very good point. 100% the account will get suspended/asked for mail verification ten times per day.
@Aathish04 and I have had some experience with this, and it didn't allow me to log in again.
Ok, fair. So YT is a huge chore, and GH is not great for our purpose. What else is there?
How about cloudinary?
So we basically need a file system that lives in the cloud, with at least a few GB in storage, for free, with rapid distribution/CDN, and that allows any one of us to easily authenticate from around the world.
Does cloudinary offer this? Why use it instead of other options? I'm always a bit wary of asking everyone to sign on / register for yet another app/service/solution. A cloud bucket could also serve the same purpose, for example.
cloud bucket could also serve the same purpose, for example.
I mean, I've got a 20GB IBM Cloud bucket on a free plan. I could set up some API keys for y'all and whatnot, but just too much of a hassle I think. Open to discussion though.
Another thing to think about: how are we going to accept contributions by other members of the community? We won't be giving out keys/passwords to literally anyone, so we will have to do something like this:
Sounds like a hassle to me
At this point, I say we keep everything in imgur.
Additionally, we should keep a file in the documentation that is a manually curated list of each example and its URL. A simple csv will do:
name,path,url
example1,examples/example1.rst,https://imgur.com/some_url_here
example2,examples/example2.rst,https://imgur.com/some_url_here
example3,examples/example3.rst,https://imgur.com/some_url_here
So whenever anyone (devs or users) contributes or adds a new example, they have to provide the example, the page in the documentation where it is linked to, as well as the imgur URL. (This list could also be generated automatically, even.)
Thoughts?
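If such a CSV list were adopted, a small check could catch broken rows before they reach the docs. The following is a minimal sketch using only the standard library; the function name `load_examples` and the validation rules (exact header, no empty fields, http(s) URLs only) are assumptions for illustration, not something agreed on in this thread.

```python
import csv
import io
from urllib.parse import urlparse

REQUIRED_COLUMNS = ["name", "path", "url"]  # header of the CSV proposed above


def load_examples(csv_text):
    """Parse the example list; reject rows with missing fields or non-http(s) URLs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != REQUIRED_COLUMNS:
        raise ValueError(f"expected header {REQUIRED_COLUMNS}, got {reader.fieldnames}")
    rows = []
    # Data rows start on line 2, right after the header line.
    for line_no, row in enumerate(reader, start=2):
        if any(not row[col] for col in REQUIRED_COLUMNS):
            raise ValueError(f"line {line_no}: empty field")
        parsed = urlparse(row["url"])
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"line {line_no}: not an http(s) URL: {row['url']}")
        rows.append(row)
    return rows
```

A check like this could run in CI on every pull request that touches the list, so contributors get immediate feedback on a malformed row.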
If we don't use a GitHub repo but cloudinary or some other storage bucket, we could set up a GitHub Action that checks for images in the documentation (the contributor should be using Imgur or something similar), uploads them to cloudinary or the bucket, and then asks the user to replace the existing image links with the new ones. Maybe, if that is hard, we can maintain a CSV file with images as mentioned above and ask contributors to edit it for images to be uploaded, and we can use it for the upload (note: this should be done once the docs are merged, before deploying). Also, what do other open-source projects do for this? And doesn't readthedocs itself host images?
Maybe, if that is hard
I don't think it's hard, it's just a bunch of extra steps for the contributor. It may discourage contributions. We want to make contributing as easy as possible, especially because manim tends to attract users that are more experienced in math than in programming.
Also what do other opensource do for this?
No idea. What other movie/animation-heavy software do we know of?
Also readthedocs themselves doesn't host images?
Everything RTD does is pull a GH repo and build/host the documentation. So that would require us to keep everything under version control which is what we are trying to avoid.
It may discourage contributions
I am totally with you there! There was once a really cool project from flipdot to collect manim examples: https://manim.flipdot.org/ But contributing there was also very difficult, so it did not grow a lot.
Maybe, if that is hard, we can maintain a CSV file with images
I would try to keep things as simple as possible with images. And when it comes to structuring, in my opinion, git is the best option.
Also, we could think of a script that searches for manim code in markdown files, puts them somewhere, and then inserts them at the right place on the website (Like #285). I could ask the https://manim.flipdot.org/ people, maybe they have an idea for us.
Or we could ask the GitHub employees; there is some information about storage and a contact form linked here: https://docs.github.com/en/github/managing-large-files/about-storage-and-bandwidth-usage Maybe they like our project and give us resources that we could use for this purpose.
And when it comes to structuring, in my opinion, git is the best option.
Also, we could think of a script that searches for manim code in markdown files, puts them somewhere, and then inserts them at the right place on the website (Like #285).
This discussion is about where to store the image files, not how to produce them or include them in the documentation. The main point here is that git will become bloated really quickly.
This discussion is about where to store the image files, not how to produce them or include them in the documentation
I think automated creation is closely related to the contribution of scripts, which was also raised in this issue. But we can move this to a separate issue.
The main point here is that git will become bloated really quickly.
There are tools that purge large files which were deleted but still linger in the repo's history. One of these tools is the bfg-repo-cleaner. If we run it from time to time, only the current video files remain.
To test whether a repo can contain more than a few videos, I just uploaded 864 videos, each ~2MB, to this repo https://github.com/ManimCommunity/manim-docs-files/tree/master/Lots_of_videos, and there was no complaint from GitHub. Here, they all have the same content and differ only in name, so git might compress them very efficiently. So I could also do this test again, but with video files that differ in content.
Yes, git can be made leaner using tools like the one you mentioned. However, what would be the workflow for someone trying to contribute a new example?
I'm not loving this workflow.
Isn't this the exact use case for gDrive/Dropbox? We can just create one of those with shared access.
P.S. Sorry I've been slow lately, work has been hectic
Isn't this the exact use case for gDrive/Dropbox? We can just create one of those with shared access.
Not really gDrive, since the point was made above that it's a major hassle to access the same account from different places in the world. Dropbox maybe?
If you're referring to https://github.com/ManimCommunity/manim/issues/308#issuecomment-674977975, the method of sharing described there is an antipattern and the same thing would probably happen with dropbox.
The way to share files is to create the shared directory and configure it so that people with different accounts can access it, not share a single account between multiple people.
Oh snap you've solved it
Question: Do gDrive and Dropbox support embedding? I don't think so. If we were to use gDrive space then we may need to use blogger, which is also a hassle as it is not possible with a shared account.
According to this random article it seems possible.
That's ok for videos, but what about a gif or image?
It does seem to be possible, given the instructions here. It's not as simple as the video embedding though.
For gifs and images: It would be cool if they can be accessed by a URL that looks just like the folder structure. Then we could use markdown and insert the images there. I just tried it with the owncloud. The only problem: the owncloud has links like :lNiKxHTlL5xTiDz
It would be cool if they can be accessed by an URL that looks just like the folder structure.
Only possible if we use something like GitHub with jsdelivr.
See the link below to what you uploaded in the other repo: https://cdn.jsdelivr.net/gh/ManimCommunity/manim-docs-files@master/Lots_of_videos/Test008.mp4
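The jsdelivr link above follows the pattern `cdn.jsdelivr.net/gh/<owner>/<repo>@<ref>/<path>`, so URLs that mirror the folder structure can be generated mechanically. A minimal sketch follows; the helper name `jsdelivr_url` is hypothetical, and the default `ref` of `master` is simply taken from the example link.

```python
from urllib.parse import quote


def jsdelivr_url(owner, repo, path, ref="master"):
    """Build a jsDelivr CDN URL for a file in a public GitHub repository."""
    # quote() keeps "/" intact by default, so the folder structure is preserved.
    return f"https://cdn.jsdelivr.net/gh/{owner}/{repo}@{ref}/{quote(path.lstrip('/'))}"
```

For example, `jsdelivr_url("ManimCommunity", "manim-docs-files", "Lots_of_videos/Test008.mp4")` reproduces the link above.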
It would be cool if they can be accessed by an URL that looks just like the folder structure
Cool, yes, but not necessary.
Then we could use markdown and insert the images there.
You can use markdown to insert any image, regardless of whether or not "the URL looks just like the folder structure".
So it seems like the easiest option is to get a folder on gDrive or Dropbox and give read/write permissions to each dev. No password sharing necessary. Are people ok with this?
I just created this folder structure:
.
└── Shapes-Geometrie
    ├── ShapeExample1.png
    ├── ShapeExample2.png
    ├── ShapeExample3.png
    ├── ShapeExample4.png
    └── subfolder
        └── image.jpg
into these 4 cloud services:
Now, I want to access them easily from a markdown file, using paths like "someurl/Shapes-Geometrie/subfolder/image.jpg". In GitHub, this is very easy and convenient: https://github.com/kolibril13/manim-snippets/blob/master/Shapes-Geometrie/subfolder/image.jpg In the owncloud, that is also more or less ok; a download link there looks like this: https://owncloud.gwdg.de/index.php/s/Wmek3AEQNb4cYVt/downloads/path=%2FShapes-Geometrie&files=ShapeExample1.png
But in google and dropbox, I only get nonsense.
Any ideas?
Users reading our documentation will (almost) never access an image/video/gif by URL directly. So I'm much less concerned about what the URL for the file looks like. (The URLs for the documentation can be handled through sphinx.)
I think github is out of the question, as per the comments in this thread. Also, we haven't discussed owncloud in this thread, and I'm not sure I want to sign up to yet another service just because the URLs look better.
I'm open to ideas here, but right now I'm leaning strongly for gDrive or dropbox.
I was able to embed images in sphinx from the google drive using ![circle](http://drive.google.com/uc?export=view&id=1pbkJt0nehMe5DZBeEEr2b5fYxCagEayw)
by following the instructions in this comment, although I don't think recommonmark allows for image resizing.
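For reference, the Drive embed above can be generated from a file ID using the standard markdown image form `![alt](url)`. This is a minimal sketch: the helper name is hypothetical, and it assumes the file is shared publicly and that the `uc?export=view&id=...` URL pattern from the comment above keeps working.

```python
from urllib.parse import urlencode


def drive_image_markdown(alt_text, file_id):
    """Return a markdown image tag pointing at a publicly shared Google Drive file."""
    # URL pattern taken from the comment above; not guaranteed to be stable.
    query = urlencode({"export": "view", "id": file_id})
    return f"![{alt_text}](http://drive.google.com/uc?{query})"
```

Calling `drive_image_markdown("circle", "1pbkJt0nehMe5DZBeEEr2b5fYxCagEayw")` yields a tag for the circle image referenced above.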
Forking the discussion that started in #297 to here.
The problem
In the documentation, there will eventually be a (hopefully) substantial number of examples, both code and output videos/gifs. Git is pretty bad at tracking binary files, and if we keep these in the main repository, it will bloat up and become very clunky. Further, not every user/developer that clones the repository will need to (or want to) interact with the example gifs/movies.
The solution
@kolibril13 suggested that we create a new repository under the ManimCommunity organization. This repo would contain the gif/video files to be included in the documentation, but not the documentation itself. The documentation itself will remain under the main repository. Further, @naveen521kk brought up the fact that using GitHub for hosting is fine, but we should use other solutions for delivery and suggested jsdelivr. The benefits are that it will be much faster, and that it will stay up even when GitHub is down.
What we need from you
Adding a new repository to the ManimCommunity organization is a fairly big change to the face of the organization, and I would not want to do that without making sure the core team approves. Up to now we have four people in favor (@Aathish04, @naveen521kk, @kolibril13, and myself). We need to hear from at least a few more among @kilacoda @yoshiask @eulertour @PgBiel @XorUnison @kilacoda @huguesdevimeux @safinsingh @faielgila. Please feel free to thumbs up or down and/or raise any issues that you want to discuss.
PS: sorry for the multiple mentions in the other issue and this one. I just can't bring myself to make executive decisions on my own without running it by the whole team!