ManimCommunity / manim

A community-maintained Python framework for creating mathematical animations.
https://www.manim.community
MIT License

New repository for documentation examples #308

Closed leotrs closed 4 years ago

leotrs commented 4 years ago

Forking the discussion that started in #297 to here.

The problem

In the documentation, there will eventually be a (hopefully) substantial number of examples, both code and output videos/gifs. Git is pretty bad at tracking binary files, and if we keep these in the main repository, it will bloat up and become very clunky. Further, not every user/developer that clones the repository will need to (or want to) interact with the example gifs/movies.

The solution

@kolibril13 suggested that we create a new repository under the ManimCommunity organization. This repo would contain the gif/video files to be included in the documentation, but not the documentation itself. The documentation itself will remain under the main repository. Further, @naveen521kk brought up the fact that using GitHub for hosting is fine, but we should use other solutions for delivery and suggested jsdelivr. The benefits are that it will be much faster, and will be up whenever github is down.

What we need from you

Adding a new repository to the ManimCommunity organization is a fairly big change to the face of the organization, and I would not want to do that without making sure the core team approves. Up to now we have four people in favor (@Aathish04, @naveen521kk, @kolibril13, and myself). We need to hear from at least a few more among @kilacoda @yoshiask @eulertour @PgBiel @XorUnison @huguesdevimeux @safinsingh @faielgila. Please feel free to thumbs up or down and/or raise any issues that you want to discuss.

PS: sorry for the multiple mentions in the other issue and this one. I just can't bring myself to make executive decisions on my own without running it by the whole team!

eulertour commented 4 years ago

The maximum file size on GitHub is 100MB so storing video in a repository probably won't last very long, especially if we use high quality video. We may end up having to host it ourselves somehow.

naveen521kk commented 4 years ago

I will consider youtube or something like that for videos, but for images and gifs a repo would be useful.

safinsingh commented 4 years ago

Would it be more useful for one of us to create a website? I could help with that for sure. That way we can have it all in one easily accessible place that doesn't have a file storage limit, and if we really need to, we can host the files on a separate platform.

huguesdevimeux commented 4 years ago

What are the other options besides having a GitHub repo? I don't see any issue in having such a thing.

The maximum file size on GitHub is 100MB so storing video in a repository probably won't last very long, especially if we use high quality video. We may end up having to host it ourselves somehow.

This is fine for all the quick examples, 100MB is enough. For longer and more complicated examples, I think we can use YouTube. But the problem with YouTube is that it can't be accessed by everyone without doing some weird password-sharing system.

kolibril13 commented 4 years ago

This is fine for all the quick examples, 100MB is enough. For longer and more complicated examples, I think we can use YouTube

Good idea! I have rarely had file sizes bigger than 50 MB, even for long and complicated scenes, so I think this will not occur too often.

Another thing we should consider is the maximum total size of the repo. I could only find an old answer (from 2017) here. They say that there is no hard limit, but you get a notification from GitHub when your repo exceeds 1 GB.

Adding a new repository to the ManimCommunity organization is a fairly big change to the face of the organization, and I would not want to do that without making sure the core team approves

To keep the current face of the organization: we could make this doc-files repo a private repo and try if jsdelivr can somehow get access to that.

naveen521kk commented 4 years ago

To keep the current face of the organization: we could make this doc-files repo a private repo and try if jsdelivr can somehow get access to that.

Jsdelivr fetches only from public repositories but yeah it caches it forever. They say so, https://www.jsdelivr.com/features

We use a permanent S3 storage to ensure all files remain available even if GitHub goes down, or a repository or a release is deleted by its author. Files are fetched directly from GitHub only the first time, or when S3 goes down.

leotrs commented 4 years ago

I think the examples gallery will only rarely contain examples larger than 100MB. If that ever happens, we can use YouTube for those few. The rest can go in a repo.

The repo should be public, because open source! But also because I think private repos are not free (as in they cost money).

Not sure if it's necessary to create a whole separate site for this though? I'd much rather just host the documentation in RTD, until we're ready to have a fully fledged actual website for the project.

kolibril13 commented 4 years ago

private repos are not free (as in they cost money).

They changed this. Private repos are also free now: https://github.com/pricing But you are right, it would be better to have this open-source. I feel comfortable with the idea to have a public second repo for img and gifs.

PgBiel commented 4 years ago

A repo for images and stuff sounds ok to me. But for videos that exceed a certain size, we might see ourselves forced to use YouTube, as was said here already.

yoshiask commented 4 years ago

I'd just like to point out that GitHub can store files larger than 100MB. It's still free to use and everything, but you have to set up Git LFS (Large File Storage), which is a bit of a pain.

leotrs commented 4 years ago

How long a video is a 100MB file at high quality?

How many hours of video could we store in a barebones GitHub repository?

eulertour commented 4 years ago

Even if we do find a way to do it with github, I'd much rather use something that's designed for file storage to store videos and gifs.

kolibril13 commented 4 years ago

How long a video is a 100MB file at high quality?

I just made a test with this script:

from manim import *

class TestExample(Scene):
    def construct(self):
        s = Square()
        c = Circle()
        # 60 iterations x 2 one-second animations = about 2 minutes of video
        for _ in range(60):
            self.play(Transform(s.copy(), c.copy()))
            self.play(Transform(c.copy(), s.copy()))

The video is 1080p at 60 fps and 2 minutes long:
-> TestExample.mp4 has a size of 6.2 MB
-> TestExample.gif has a size of 52.8 MB
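For a rough sense of scale, the measurement above can be extrapolated to the 100 MB per-file limit mentioned earlier in the thread. This is only a sketch: it assumes file size grows linearly with duration at the same quality, which real encoders do not guarantee, and the function name is made up for illustration.

```python
# Measured figures quoted above: a 2-minute render at 1080p, 60 fps.
MP4_MB = 6.2
GIF_MB = 52.8
DURATION_MIN = 2.0
FILE_LIMIT_MB = 100.0  # per-file limit mentioned earlier in the thread


def minutes_within_limit(size_mb: float, duration_min: float, limit_mb: float) -> float:
    """Assume linear scaling: minutes of video that fit in limit_mb."""
    mb_per_minute = size_mb / duration_min
    return limit_mb / mb_per_minute


mp4_minutes = minutes_within_limit(MP4_MB, DURATION_MIN, FILE_LIMIT_MB)
gif_minutes = minutes_within_limit(GIF_MB, DURATION_MIN, FILE_LIMIT_MB)
print(f"mp4: about {mp4_minutes:.1f} min per 100 MB")
print(f"gif: about {gif_minutes:.1f} min per 100 MB")
```

Under that assumption, a single mp4 could run roughly half an hour before hitting the limit, while a gif at the same settings would hit it in under four minutes.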

leotrs commented 4 years ago

Holy crap what a difference. So I don't think we'll have many examples that are longer than 2min. I'm assuming that examples will be short and succinct. BUT, a single 2min example occupying 50MB is kind of ridiculous, and the git history is going to explode.

@eulertour what do you suggest we use instead?

Aathish04 commented 4 years ago

BUT, a single 2min example occupying 50MB is kind of ridiculous, and the git history is going to explode.

I think it's worth noting that the video was rendered in 1080p at 60FPS. I don't think we need to have every single example animation rendered at such a high quality.

leotrs commented 4 years ago

Fair. But still, now I'm on the fence.

kolibril13 commented 4 years ago

git history is going to explode.

We could write a bot that cleans the git history from time to time. I think version control is not so important in the case of videos.

But if there is another possible solution (maybe from eulertour), I would prefer that as well.

leotrs commented 4 years ago

If we don't put it in version control, then we can't use GitHub (or GitHub + jsdelivr) as hosting.

I think we should just use YouTube and share a password :shrug:

huguesdevimeux commented 4 years ago

Using YouTube would be, to me, a waste of time. If you want to add/update/whatever a scene example, you will have to ask for the password, log in, remove the old video, upload the replacement, and put in the new link.

I think that with something more collaboration-oriented, such as GitHub, doing something more automated would be possible.

leotrs commented 4 years ago

I mean we could just set the password once and share it with all devs, no need to ask every time. But yes I see your point and I agree it could get old really fast.

Aathish04 commented 4 years ago

I mean we could just set the password once and share it with all devs, no need to ask every time.

Google hates it when people on opposite sides of the planet log into the same account, anti-hacking measures and all. I'd much rather have just a single person handle everything Youtube related if we go that route.

huguesdevimeux commented 4 years ago

Google hates it when people on opposite sides of the planet log

Very, very good point. 100% the account will get suspended/asked for mail verification ten times per day.

naveen521kk commented 4 years ago

Very, very good point. 100% the account will get suspended/asked for mail verification ten times per day.

@Aathish04 and I have some experience with this: it didn't allow me to log in again.

leotrs commented 4 years ago

Ok, fair. So YT is a huge chore, and GH is not great for our purpose. What else is there?

naveen521kk commented 4 years ago

How about cloudinary?

leotrs commented 4 years ago

So we basically need a file system that lives in the cloud, with at least a few GB in storage, for free, with rapid distribution/CDN, and that allows any one of us to easily authenticate from around the world.

Does cloudinary offer this? Why use it instead of other options? I'm always a bit wary of asking everyone to sign on / register for yet another app/service/solution. A cloud bucket could also serve the same purpose, for example.

Aathish04 commented 4 years ago

cloud bucket could also serve the same purpose, for example.

I mean, I've got a 20GB IBM Cloud bucket on a free plan. I could set up some API keys for y'all and whatnot, but just too much of a hassle I think. Open to discussion though.

leotrs commented 4 years ago

Another thing to think about: how are we going to accept contributions by other members of the community? We won't be giving out keys/passwords to literally anyone, so we will have to do something like this:

  1. ask them to upload their example gif somewhere (say imgur)
  2. open a PR containing their example code, documentation, and a link to imgur
  3. and then one of us is going to have to take the file from imgur and then put it in YT/GH/cloud/bucket/etc

Sounds like a hassle to me

leotrs commented 4 years ago

At this point, I say we keep everything in imgur.

  1. This means we don't need to care about authentication/passwords/keys
  2. imgur is specialized in serving images and I've never ever heard it being down
  3. imgur supports both mp4 and gif
  4. imgur maximum file size for animated images is 200MB
  5. the only limit I can find is 60 seconds in length, which sounds fine to me since we are talking about an examples gallery

Additionally, we should also keep a file in the documentation that is a manually-curated list to each example URL. A simple csv will do:

name,path,url
example1,examples/example1.rst,https://imgur.com/some_url_here
example2,examples/example2.rst,https://imgur.com/some_url_here
example3,examples/example3.rst,https://imgur.com/some_url_here

So whenever anyone (devs or users) contributes or adds a new example, they have to provide the example, the page in the documentation where it is linked to, as well as the imgur URL. (This list could also be generated automatically, even.)

Thoughts?
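The manifest idea above could be checked mechanically before a PR is merged. The following is a minimal sketch, assuming the three-column layout shown in the comment; the function name and the validation rules (no empty fields, HTTPS only) are illustrative choices, not something agreed on in this thread.

```python
import csv
import io

# Hypothetical manifest content, following the name,path,url layout above.
MANIFEST = """name,path,url
example1,examples/example1.rst,https://imgur.com/some_url_here
example2,examples/example2.rst,https://imgur.com/some_url_here
"""


def load_manifest(text: str) -> list[dict[str, str]]:
    """Parse the manifest; reject rows with empty fields or non-HTTPS URLs."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        for field in ("name", "path", "url"):
            if not row.get(field):
                raise ValueError(f"missing {field!r} in row {row}")
        if not row["url"].startswith("https://"):
            raise ValueError(f"not an https URL: {row['url']}")
    return rows


entries = load_manifest(MANIFEST)
print([entry["name"] for entry in entries])
```

A check like this only confirms the file is well formed; it does not confirm that the linked image still exists.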

naveen521kk commented 4 years ago

If not a GitHub repo, and we use cloudinary or any storage bucket instead, we could set up a GitHub Action that checks the documentation for images (the contributor would be using imgur or something similar), uploads them to cloudinary or the bucket, and then asks the user to replace the existing image links with the new ones.

Maybe, if that is hard, we can maintain a CSV file with images as mentioned above, ask the contributors to edit it to list the images to be uploaded, and use it for the upload. (Note: this should be done once the docs are merged, before deploying.)

Also, what do other open-source projects do for this? Also, doesn't readthedocs itself host images?
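The first step of such an Action, finding externally hosted images in the documentation, might look like the sketch below. The regular expression and function name are assumptions for illustration; the upload step is left out because it depends on whichever storage service is chosen.

```python
import re

# Matches imgur links, with or without the "i." image subdomain.
IMGUR_URL = re.compile(r"https?://(?:i\.)?imgur\.com/[A-Za-z0-9_./-]+")


def find_imgur_links(doc_text: str) -> list[str]:
    """Return every imgur URL in the text, in order, without duplicates."""
    return list(dict.fromkeys(IMGUR_URL.findall(doc_text)))


# Hypothetical documentation fragment.
sample = """
.. image:: https://i.imgur.com/abc123.gif

See also https://imgur.com/xyz789 and https://i.imgur.com/abc123.gif again.
"""
links = find_imgur_links(sample)
print(links)
```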

leotrs commented 4 years ago

Maybe, if that is hard

I don't think it's hard, it's just a bunch of extra steps for the contributor. It may discourage contributions. We want to make contributing as easy as possible, especially because manim tends to attract users that are more experienced in math than in programming.

Also, what do other open-source projects do for this?

No idea. What other movie/animation-heavy software do we know of?

Also, doesn't readthedocs itself host images?

All RTD does is pull a GH repo and build/host the documentation. So that would require us to keep everything under version control, which is what we are trying to avoid.

kolibril13 commented 4 years ago

It may discourage contributions

I'm totally with you there! There was once a really cool project from flipdot to collect manim examples: https://manim.flipdot.org/ However, contributing to it was also very difficult, so it did not grow a lot.

Maybe, if that is hard, we can maintain a CSV file with images

I would try to keep things as simple as possible with images. And when it comes to structuring, in my opinion, git is the best option.

Also, we could think of a script that searches for manim code in markdown files, puts them somewhere, and then inserts them at the right place on the website (Like #285). I could ask the https://manim.flipdot.org/ people, maybe they have an idea for us.
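As a rough illustration of the "search for manim code in markdown files" step, the sketch below pulls out fenced python blocks that define a Scene subclass. It assumes examples are written as fenced blocks tagged python; rendering the snippets and placing the output on the website are not covered, and all names are hypothetical.

```python
import re

FENCE = "`" * 3  # built indirectly so this sketch can sit inside a fenced block
CODE_BLOCK = re.compile(
    re.escape(FENCE) + r"python\n(.*?)" + re.escape(FENCE), re.DOTALL
)


def extract_scene_snippets(markdown: str) -> list[str]:
    """Return the bodies of python code blocks that define a Scene subclass."""
    blocks = CODE_BLOCK.findall(markdown)
    return [block for block in blocks if "(Scene)" in block]


# Hypothetical markdown page with one scene and one unrelated snippet.
sample = (
    "Some text.\n"
    + FENCE + "python\n"
    + "class Demo(Scene):\n    def construct(self):\n        pass\n"
    + FENCE + "\n"
    + FENCE + "python\nprint('not a scene')\n" + FENCE + "\n"
)
snippets = extract_scene_snippets(sample)
print(len(snippets))
```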

Or we ask the GitHub employees; there is some information about storage and a contact form linked here: https://docs.github.com/en/github/managing-large-files/about-storage-and-bandwidth-usage Maybe they like our project and would give us resources that we could use for this purpose.

leotrs commented 4 years ago

And when it comes to structuring, in my opinion, git is the best option.

Also, we could think of a script that searches for manim code in markdown files, puts them somewhere, and then inserts them at the right place on the website (Like #285).

This discussion is about where to store the image files, not how to produce them or include them in the documentation. The main point here is that git will become bloated really quickly.

kolibril13 commented 4 years ago

This discussion is about where to store the image files, not how to produce them or include them in the documentation

I think automated creation is closely related to the contribution of scripts, which was also raised in this issue. But we can move this to a separate issue.

The main point here is that git will become bloated really quickly.

There are tools to remove large files that have been deleted but are still lingering in the repo's history. One of these tools is bfg-repo-cleaner. If we run it from time to time, the repo only contains the current video files.

To test whether a repo can contain more than a few videos, I just uploaded 864 videos, each ~2MB, to this repo https://github.com/ManimCommunity/manim-docs-files/tree/master/Lots_of_videos, and there was no complaint from GitHub. They all have the same content and differ only in name, so git might compress them very efficiently. I could also do this test again, but with video files that differ in content.
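The caveat about identical content matters because git identifies file contents (blobs) by a hash of the bytes, while file names live in separate tree objects. Files with the same bytes therefore map to the same blob regardless of name. The sketch below reproduces the blob ID computation, assuming git's default SHA-1 object format; the byte string stands in for a real video file.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the SHA-1 object ID git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


video_bytes = b"pretend this is the content of a rendered video"
first = git_blob_id(video_bytes)    # one copy of the file
second = git_blob_id(video_bytes)   # a second copy under another name
different = git_blob_id(video_bytes + b"!")

print(first == second)     # same content, same object
print(first == different)  # different content, different object
```

So the 864-file test mostly measures how GitHub handles many file names, not many megabytes of distinct video data.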

leotrs commented 4 years ago

Yes, git can be made leaner using tools like the one you mentioned. However, what would be the workflow for someone trying to contribute a new example?

  1. They have to clone two repositories, this one to write the documentation, and the other one to push the files
  2. They have to make two different PRs, probably at the same time, unless we want to allow arbitrary pushes to the manim-docs-files repo.
  3. Even if the above two points were somehow automatable using Actions or some other solution, we would still be asking each contributor to download the full repo of examples just so they can add a new one. So we might well be asking them to download 1GB (or more) of gifs, so they can push a 3s video.

I'm not loving this workflow.

eulertour commented 4 years ago

Isn't this the exact use case for gDrive/Dropbox? We can just create one of those with shared access.

P.S. Sorry I've been slow lately, work has been hectic

leotrs commented 4 years ago

Isn't this the exact use case for gDrive/Dropbox? We can just create one of those with shared access.

Not really gDrive, since the point was made above that it's a major hassle to access the same account from different places in the world. Dropbox maybe?

eulertour commented 4 years ago

If you're referring to https://github.com/ManimCommunity/manim/issues/308#issuecomment-674977975, the method of sharing described there is an antipattern and the same thing would probably happen with Dropbox.

The way to share files is to create the shared directory and configure it so that people with different accounts can access it, not share a single account between multiple people.

leotrs commented 4 years ago

Oh snap you've solved it

naveen521kk commented 4 years ago

Question: Do gDrive and Dropbox support embedding? I don't think so. If we were to use gDrive space then we may need to use Blogger, which is also a hassle as it is not possible with a shared account.

eulertour commented 4 years ago

According to this random article it seems possible.

naveen521kk commented 4 years ago

That's OK for videos, but what about a gif or image?

Aathish04 commented 4 years ago

It does seem to be possible, given the instructions here. It's not as simple as the video embedding though.

kolibril13 commented 4 years ago

For gifs and images: It would be cool if they can be accessed by a URL that looks just like the folder structure. Then we could use markdown and insert the images there. I just tried it with owncloud. The only problem: owncloud has links like :lNiKxHTlL5xTiDz

naveen521kk commented 4 years ago

It would be cool if they can be accessed by an URL that looks just like the folder structure.

Only possible if we use something like GitHub with jsdelivr.

See the link below of what you uploaded in other repo. https://cdn.jsdelivr.net/gh/ManimCommunity/manim-docs-files@master/Lots_of_videos/Test008.mp4
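The link above follows the pattern cdn.jsdelivr.net/gh/USER/REPO@REF/PATH, so the URL mirrors the folder structure of the repository. A small helper (hypothetical name) that builds such URLs could look like this:

```python
from urllib.parse import quote


def jsdelivr_url(user: str, repo: str, ref: str, path: str) -> str:
    """Build a cdn.jsdelivr.net URL for a file in a public GitHub repository."""
    # quote() keeps "/" intact and percent-encodes characters such as spaces.
    return f"https://cdn.jsdelivr.net/gh/{user}/{repo}@{ref}/{quote(path)}"


url = jsdelivr_url(
    "ManimCommunity", "manim-docs-files", "master", "Lots_of_videos/Test008.mp4"
)
print(url)
```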

leotrs commented 4 years ago

It would be cool if they can be accessed by an URL that looks just like the folder structure

Cool, yes, but not necessary.

Then we could use markdown and insert the images there.

You can use markdown to insert any image, regardless of whether or not "the URL looks just like the folder structure".

leotrs commented 4 years ago

So it seems like the easiest option is to get a folder on gDrive or Dropbox and give read/write permissions to each dev. No password sharing necessary. Are people ok with this?

kolibril13 commented 4 years ago

I just created this folder structure:

.
└── Shapes-Geometrie
    ├── ShapeExample1.png
    ├── ShapeExample2.png
    ├── ShapeExample3.png
    ├── ShapeExample4.png
    └── subfolder
        └── image.jpg

into these 4 cloud services:

Now, I want to access them easily from a markdown file, using paths like "someurl/Shapes-Geometrie/subfolder/image.jpg".

In GitHub, this is very easy and convenient: https://github.com/kolibril13/manim-snippets/blob/master/Shapes-Geometrie/subfolder/image.jpg

In owncloud, that is also more or less OK. A download link there looks like this: https://owncloud.gwdg.de/index.php/s/Wmek3AEQNb4cYVt/downloads/path=%2FShapes-Geometrie&files=ShapeExample1.png

But in google and dropbox, I only get nonsense.

Any ideas?

leotrs commented 4 years ago

Users reading our documentation will (almost) never access an image/video/gif by URL directly. So I'm much less concerned about what the URL for the file looks like. (The URLs for the documentation can be handled through sphinx.)

I think github is out of the question, as per the comments in this thread. Also, we haven't discussed owncloud in this thread, and I'm not sure I want to sign up to yet another service just because the URLs look better.

I'm open to ideas here, but right now I'm leaning strongly for gDrive or dropbox.

eulertour commented 4 years ago

I was able to embed images in sphinx from Google Drive using ![circle](http://drive.google.com/uc?export=view&id=1pbkJt0nehMe5DZBeEEr2b5fYxCagEayw) by following the instructions in this comment, although I don't think recommonmark allows for image resizing.
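For reference, the embed line above can be generated from a Drive file ID. This sketch assumes the standard markdown image syntax, ![alt](url), and the uc?export=view URL form quoted in the comment; whether Google Drive keeps serving that URL form is not something this snippet can verify, and the function name is made up.

```python
from urllib.parse import urlencode


def drive_image_markdown(alt_text: str, file_id: str) -> str:
    """Build a markdown image line pointing at a shared Google Drive file."""
    base = "http://drive.google.com/uc"
    query = urlencode({"export": "view", "id": file_id})
    return f"![{alt_text}]({base}?{query})"


line = drive_image_markdown("circle", "1pbkJt0nehMe5DZBeEEr2b5fYxCagEayw")
print(line)
```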