hplgit / doconce

Lightweight markup language - document once, include anywhere
http://hplgit.github.io/doconce/doc/web/index.html

The repository is huge #8

Closed: mardukbp closed this issue 9 years ago

mardukbp commented 9 years ago

I am a newcomer to doconce and I am really eager to start using it. However, having to download 300+ MB just to try the program is too much. du -h -d 1 reveals that more than half of that is git objects, which I believe are unnecessary for most users. A large amount of space is also taken up by files related to slide generation, which are not essential either. In addition, I noticed an overlap between the contents of the bundled and lib directories: the latter includes zip files of some of the directories in the former.
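For readers who want to repeat this kind of measurement without du, the following is a minimal Python sketch of a per-directory size breakdown. The function names are hypothetical, and it sums file sizes rather than disk blocks, so the numbers will differ slightly from what du reports.

```python
import os
from pathlib import Path


def dir_size(path: Path) -> int:
    """Total size in bytes of the regular files under path (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = Path(root) / name
            if not file_path.is_symlink():
                total += file_path.stat().st_size
    return total


def size_by_top_level(checkout: Path) -> dict[str, int]:
    """Map each top-level entry of a checkout (including .git) to its size in bytes."""
    sizes = {}
    for entry in checkout.iterdir():
        if entry.is_symlink():
            continue
        sizes[entry.name] = dir_size(entry) if entry.is_dir() else entry.stat().st_size
    return sizes


def format_report(sizes: dict[str, int]) -> list[str]:
    """One line per entry, largest first, with its share of the total."""
    total = sum(sizes.values()) or 1  # avoid division by zero for an empty checkout
    ranked = sorted(sizes.items(), key=lambda item: item[1], reverse=True)
    return [
        f"{size / 1e6:8.1f} MB  {100 * size / total:5.1f}%  {name}"
        for name, size in ranked
    ]
```

Printing the lines of format_report(size_by_top_level(Path("."))) from inside a checkout gives a rough picture of how much of the download is .git and how much is documentation or bundled files.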

This comment is not intended to be a rant and I apologize if it sounds like it. I am very interested in using this tool and more importantly, I want my colleagues to use it.

hplgit commented 9 years ago

You are right that 1/4 of the space is taken up by a .pack file in the .git directory (if you know how to reduce that, I would be thankful) and 1/4 by slide demonstrations. The latter files could of course be moved out of the repo (they are there because documentation and demos in general are part of the repo). The zip files you mention are quite small in the big picture and are needed to avoid installing a large number of individual data files. So it would be straightforward to reduce the size to 75%, but is 400 MB that big a deal these days? The software infrastructure needed to compile doconce to various formats is much bigger and must also be installed.

mardukbp commented 9 years ago

git gc --aggressive --prune reduced the size of the pack file to 111 MB.
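To check a figure like this on your own clone, one option is to sum the pack files before and after repacking. The sketch below is only an illustration: the helper names are hypothetical, it assumes a standard non-bare repository layout with packs under .git/objects/pack, and repack_and_report needs git on the PATH.

```python
import subprocess
from pathlib import Path


def pack_size_bytes(repo: Path) -> int:
    """Sum the sizes of all *.pack files in repo/.git/objects/pack (0 if none)."""
    pack_dir = repo / ".git" / "objects" / "pack"
    if not pack_dir.is_dir():
        return 0
    return sum(pack.stat().st_size for pack in pack_dir.glob("*.pack"))


def repack_and_report(repo: Path) -> tuple[int, int]:
    """Run the gc command from the comment above; return (before, after) in bytes."""
    before = pack_size_bytes(repo)
    # Same command as quoted above; it only repacks this local clone.
    subprocess.run(["git", "gc", "--aggressive", "--prune"], cwd=repo, check=True)
    return before, pack_size_bytes(repo)
```

Dividing the two returned values by 1e6 gives sizes in MB that can be compared with the 111 MB reported here.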

I view doconce as a relative of pandoc. What I dislike about pandoc is that with every new version I have to compile a whole bunch of Haskell libraries. In the end, the installed size is approx. 130 MB. A basic LaTeX installation is less than 100 MB. In my opinion, installing doconce in its current state feels like installing an operating system. The installation instructions are daunting. There are too many dependencies. I would suggest separating the core of doconce from the outer layers and letting the user install the programs, style files, etc. as required. That is more or less how Python, LaTeX and Julia work.

hplgit commented 9 years ago

I run the git gc command you list regularly, but it does not seem to affect the size of the .git directory of a new checkout (?).

The core of doconce is the .py files plus a collection of style files (this collection is not big, about 6 MB). Note that many of the style files are modified (e.g., so that you can change slide style without the significant changes in font size and layout that the original versions would cause).

The big part of a doconce checkout is the doc directory. It can relatively easily be moved to a separate repo. With a permanent delete, the .git directory will also be significantly reduced in size. However, I need more arguments for why the repository size is really an issue; it seems that the size of the installation is the greater concern.
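One way such a permanent delete could be done is with git filter-branch, sketched below under several assumptions: the helper names are hypothetical, nothing here is taken from the doconce repository itself, and the commands rewrite history, so every existing clone would have to be re-cloned afterwards. By default the sketch only prints the commands.

```python
import shlex
import subprocess
from pathlib import Path


def purge_commands(subdir: str) -> list[list[str]]:
    """Git commands that drop subdir from every commit and then repack."""
    index_filter = f"git rm -r --cached --ignore-unmatch {shlex.quote(subdir)}"
    return [
        # Rewrite all branches and tags so that no commit contains subdir.
        ["git", "filter-branch", "--index-filter", index_filter,
         "--prune-empty", "--", "--all"],
        # Expire reflog entries that still point at the old commits.
        ["git", "reflog", "expire", "--expire=now", "--all"],
        # Repack and discard objects that are no longer reachable.
        ["git", "gc", "--prune=now", "--aggressive"],
    ]


def purge(repo: Path, subdir: str, dry_run: bool = True) -> None:
    """Print the commands; execute them in repo only when dry_run is False."""
    for cmd in purge_commands(subdir):
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, cwd=repo, check=True)
```

Note that filter-branch keeps backup refs under refs/original/, so the old objects stay reachable, and the space is not reclaimed, until those refs are deleted as well. Running purge(Path("."), "doc") on a throwaway copy first is the safer way to see what would happen.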

Regarding installation, which software you install is entirely up to you. I offered the install_doconce.sh script only because I got too many emails asking what was wrong when some demo failed. Most users want a complete environment for simplicity, since disk space and download time are not that prohibitive today. If you think this collection of potentially needed software is too big and you just want one or two output formats, edit the installation script or install packages manually as you need them. It is not sensible to reduce the dependencies, since every dependency is tied to a feature, and together the features do demand a huge software collection.

(BTW, when you recommend Python for scientific computing as an alternative to Matlab, you don't recommend installing plain Python and the packages one would need; you typically point users to a big bundle like Anaconda, where users get far too much but avoid all the frustrations and show stoppers associated with a manual install.)

mardukbp commented 9 years ago

As an example, I just downloaded org-mode's git repo. It weighs 66 MB. I believe that org-mode and doconce are quite similar in capabilities. For lack of better arguments, I will close this issue.