Debian / debiman

debiman generates a static manpage HTML repository out of a Debian archive
Apache License 2.0

link to older archives if present #91

Open anarcat opened 7 years ago

anarcat commented 7 years ago

I just had a thought that it would actually be really great if we had the history of older manpages from unsupported suites. For example, I was just looking for the manpage of dpkg-buildpackage for wheezy and it wasn't linked here:

https://manpages.debian.org/jessie/dpkg-dev/dpkg-buildpackage.1.en.html

yet it actually exists there:

https://manpages.debian.org/wheezy/dpkg-dev/dpkg-buildpackage.1.en.html

So I'd ask for two things, if I can:

  1. link to oldoldstable if the suite is present
  2. more ambitiously: extract manpages for all available suites, possibly even going back to the snapshots

This would neatly fix the concerns with the "stability" of codenamed suites (expressed in #54): we just keep those forever, basically. From what I understand, disk space usage isn't critical enough to keep us from doing this.

stapelberg commented 7 years ago

I think we could process oldoldstable (but possibly we need more RAM on manziarly for that). I can test this on my machine to see how much the RAM footprint increases.

I think processing manpages for all Debian releases is overwhelming, both in terms of significantly growing the resource requirements, and in terms of overloading the UI with a large list of releases.

stapelberg commented 7 years ago

In our current setup, we use about 1G of resident set size. With oldoldstable added, we use about 1.5G of resident set size.

Looking at manziarly, we’re already swapping during regular operation: https://munin.debian.org/debian.org/manziarly.debian.org/index.html#system

So, I see a couple of options:

  1. We just enable oldoldstable, prolonging the time during which manziarly serves slow redirects because it’s heavily swapping. I’d like to avoid this option.
  2. We ask DSA to increase the RAM on manziarly from 2G to 4G to eliminate swapping and have enough capacity for also processing oldoldstable. This is the simplest option, if we have the resources.
  3. We try to reduce debiman’s memory usage.

anarcat commented 7 years ago

i'd love to see 2 and 3 happen... :) it seems fairly straightforward to profile memory usage in go but i haven't played with that, personally.
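For reference, a minimal sketch of what heap profiling could look like with the standard `runtime/pprof` package. This is not debiman's code: the `heapProfile` helper and the `heap.pprof` output path are made up for illustration, and a real run would take the snapshot after the memory-hungry work has happened.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"runtime"
	"runtime/pprof"
)

// heapProfile is a hypothetical helper: it forces a garbage collection so
// the snapshot reflects live objects, then returns the heap profile bytes.
func heapProfile() ([]byte, error) {
	runtime.GC()
	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	// The work to be profiled would run here, before the snapshot.
	prof, err := heapProfile()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// "heap.pprof" is an assumed output path, not something debiman uses.
	if err := os.WriteFile("heap.pprof", prof, 0o644); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("wrote %d bytes of heap profile\n", len(prof))
}
```

The resulting file can then be inspected with `go tool pprof heap.pprof` to see which allocations dominate.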

i also wonder if we couldn't rely on archived suites not changing. for example, we could have links to the squeeze suite now and generate those manpages once without having to reparse the whole suite at every run. we just need to keep links for the relevant manpages... same probably applies to wheezy: it's unlikely that we have manpage changes in LTS...

wouldn't that approach save some resources? or is it too much of a design change?

stapelberg commented 7 years ago

You’re right regarding the content of the manpages, and rendering is indeed skipped already. Checking whether manpages need to be rendered only takes on the order of a few seconds for the entire corpus, so this is not worth optimizing.

Note, however, that the navigation panel on the manpages of all versions needs to be updated whenever any version changes. E.g., if a package gets removed from testing, it shouldn’t appear in the oldoldstable version’s navigation panel. Hence, we need to process all pages.