dankelley / oce

R package for oceanographic processing
http://dankelley.github.io/oce/
GNU General Public License v3.0

vignette bloats package size hugely #862

Closed dankelley closed 8 years ago

dankelley commented 8 years ago

Jeez. R CMD check now gives the NOTE shown below. The problem is that inst/doc now has oce.html, which is 1.5Mb. I'm going to see if there is a way to produce only PDF. This all arises from the inclusion of a lot of new diagrams in the vignette. The problem is that, in my mind, diagrams are the whole point of a vignette.

Anyway, I have an actual bug to fix, on mapImage() so I am going to drop the vignette for a while.

It is possible that there is just not going to be a way to have lots of figures in the vignette. And, at that point, I am not really sure there is any point to the vignette and I'd vote to just delete it for this release.

* checking installed package size ... NOTE
  installed size is  5.2Mb
  sub-directories of 1Mb or more:
    doc    1.6Mb
    help   1.3Mb

richardsc commented 8 years ago

I think it's possible to specify image sizes/resolution when using knitr. That could reduce the size considerably from just using the defaults. I'll try to look into this today too -- I don't like the idea of removing the vignette.

dankelley commented 8 years ago

Cool, thanks. Bear in mind that we not only have to remove the NOTE but also need to keep an eye on the total package size, which we've had complaints about for a long time. I think I differ on the utility of the vignette. We have sweated bullets to shave 20K here and 50K there off the datasets, and I don't think we should throw all that effort away for some pictures. We can always put the vignette on a blog. But you may convince me :smile:

richardsc commented 8 years ago

Ok. I'll think about it.

A note -- I just had a look at the marmap vignettes, and the total there is about 2 MB. Maybe the rest of the package is small so it doesn't matter, but I expect there's a way to make the vignette smaller.

dankelley commented 8 years ago

Any thoughts on the timing? I can have a go at this, and it would be easier for me to do since this stuff is "in my head" right now ... but I don't want to have merge conflicts if you spend the day rewriting things and I also spend the day going over the same lines in the .Rmd. What I'd like to do is

Note: it's hard to predict the reduction in file size because png is compressed. But I definitely think we need to get it small enough to avoid a NOTE on the specific file, and another NOTE on the whole package. I would also not like the whole package to double in size compared to the last submission.

richardsc commented 8 years ago

Sorry, I won't be able to get to this during the day today -- off for a trip to the museum. I'll probably have some time this evening, after you've finished for the day.

There must be an Rmarkdown preference for the yaml section that specifies the resolution of the images -- effectively the res argument for png. That can make a big difference for shrinking pngs.
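
To get a rough feel for how much the res argument of png() matters, one could render the same plot at a few resolutions and compare the resulting file sizes. This is only a sketch: the helper name pngSize, the test plot, and the 5in by 4in figure size are made up for illustration and are not taken from the vignette.

```r
# Hypothetical helper: write one line plot to a temporary PNG at the
# given resolution and return the file size in bytes.
pngSize <- function(res) {
    set.seed(1) # same data on every call, so only 'res' varies
    f <- tempfile(fileext = ".png")
    png(f, width = 5, height = 4, units = "in", res = res)
    plot(cumsum(rnorm(1000)), type = "l")
    dev.off()
    size <- file.size(f)
    unlink(f)
    size
}

# Compare a few candidate resolutions (values chosen arbitrarily)
sapply(c(72, 150, 300), pngSize)
```

The actual savings will depend on the figure, since PNG compression behaves differently for line drawings and for images.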

Also, why are we doing an HTML vignette, when in the past we've had pdf? Is there a way to have the Rmarkdown convert to a pdf (i.e. through latex), so that it will be a small document?

dankelley commented 8 years ago

The PDF is 1.8M, compared with the HTML of 1.5M. So that's not the solution.
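
For comparisons like this, a small helper that lists file sizes can save some manual checking. The sketch below assumes it is run from the package source directory, so that inst/doc is the right path; the function name docSizes is invented for this example.

```r
# Hypothetical helper: report the size, in MB, of each file under a
# directory. The default path follows the inst/doc location discussed
# in this thread; adjust it to match the local checkout.
docSizes <- function(dir = "inst/doc") {
    files <- list.files(dir, full.names = TRUE, recursive = TRUE)
    data.frame(file = files, MB = round(file.size(files) / 1024^2, 2))
}

docSizes()
```

Running this before and after each change to the .Rmd gives a quick check on whether a given tweak is worth keeping.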

richardsc commented 8 years ago

What about explicitly setting the resolution for the plots in the chunk options using the res or dpi chunk options?

http://yihui.name/knitr/options/
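
A minimal sketch of what this might look like in a setup chunk near the top of the .Rmd. Note that the knitr chunk option is dpi (res is the name of the corresponding argument to png()). The particular values below are placeholders to experiment with, not settings taken from the oce vignette.

```r
# Set defaults for all later chunks; any chunk can still override them.
# All values here are illustrative assumptions.
knitr::opts_chunk$set(
    dev = "png",     # raster output
    dpi = 72,        # resolution of the rendered figures
    fig.width = 5,   # figure width, in inches
    fig.height = 4   # figure height, in inches
)
```

Individual chunks can also set these options in their headers, which may be useful if only a few figures dominate the file size.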

richardsc commented 8 years ago

This link might also be relevant

https://groups.google.com/forum/m/#!topic/knitr/Jkc9HWa4zl8

dankelley commented 8 years ago

Thanks for the ideas -- I'm doing these a bit at a time and monitoring. I don't have the old oce.pdf vignette anymore... can you post a comment of how big that is, if you're on a computer and not a phone? I'd like to have a target. I can definitely get it under 1.5M so the note will go away, but of course I don't want the note on the overall size, either.

richardsc commented 8 years ago

I'm not at a computer now, but it should be the same as the CRAN one I think

https://cran.r-project.org/web/packages/oce/vignettes/oce.pdf

dankelley commented 8 years ago

Thanks. I've got it down to 1.27Mb with DPI alteration and the removal of a figure we did not really need. The old pdf was 0.34Mb, so a rough statement is that the new figures have cost 1Mb, or 3X the size of our largest data file.

I will continue along these lines. I think some figures are redundant, e.g. I have a world view in 3 different projections that are only different in subtle features. The code is there and a person can easily cut/paste to get high res on their screen, so I think I'll remove some of those figures. I would like to get it under 1Mb, just as a target (not for any particular reason -- it's like a 4-minute mile thing).

You should stop looking at your phone and enjoy the day. I'll twiddle with this while watching a movie on tv.

dankelley commented 8 years ago

I now have inst/doc/oce.html down to 634K, or a bit under double the old inst/doc/oce.pdf size, but with several new figures. My guess is that dropping the dpi below the current 72 might help a bit more, if we need that. We could also chop a figure or two, e.g. it occurs to me that the plotScan example is not going to be of any interest to 90% of users, who are working with (cleaned-up) archived data. At any rate, I'm going to close this. See also my note at #859.