esell / deb-simple

A lightweight, bare-bones apt repository server
MIT License

Ability to use a directory containing deb pkgs #12

Open akshaymoghe opened 8 years ago

akshaymoghe commented 8 years ago

Most of the time we already have a directory full of packages that we'd like to serve via deb-simple. Is there an easy way to add a config flag that would let this dir be served directly, instead of iterating over every file in the dir and POSTing it to the server (which takes a long time when there are many files or the files are large)?

Happy to take on the work if you think this is functionality you agree with (and have thoughts on how to implement).

esell commented 8 years ago

Just out of curiosity, would you see this as a one-time thing (you pass this flag once and then not on the next start-up) or as an alternative way of using deb-simple all the time? It sounds like in your situation you might have another process putting these files on disk, as opposed to POSTing them via a CI tool or something.

I think that it certainly is something I'd be interested in seeing a PR for if you are up for it. It'll be a few weeks for me before I can start working on the request so if you're able to knock something out before that I'm all for it!

akshaymoghe commented 8 years ago

Yes, as the end result of a build step, we have several (hundred) packages that end up in a directory. One option is to walk over this directory and POST each pkg to the deb-simple server. Another is to get deb-simple to recognize the dir and use the packages within it.
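For reference, the first option (walk the directory and POST each package) could look roughly like the sketch below. The `/upload` path, the `arch` and `distro` query parameters, and the `file` form field are assumptions about the server's upload endpoint, and the helper names are made up for illustration. Each package is buffered in memory before it is sent, which is part of why this approach gets slow with large files.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"io/fs"
	"mime/multipart"
	"net/http"
	"net/url"
	"os"
	"path/filepath"
	"strings"
)

// uploadURL builds the upload endpoint for one distro/arch pair.
// Assumption: the server accepts POSTs at /upload?arch=...&distro=...
func uploadURL(base, distro, arch string) string {
	q := url.Values{}
	q.Set("arch", arch)
	q.Set("distro", distro)
	return strings.TrimRight(base, "/") + "/upload?" + q.Encode()
}

// findDebs returns every *.deb file below root.
func findDebs(root string) ([]string, error) {
	var debs []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() && strings.HasSuffix(d.Name(), ".deb") {
			debs = append(debs, path)
		}
		return nil
	})
	return debs, err
}

// postDeb sends one package as a multipart form upload.
// Assumption: the form field is named "file".
func postDeb(client *http.Client, endpoint, path string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	var body bytes.Buffer
	w := multipart.NewWriter(&body)
	part, err := w.CreateFormFile("file", filepath.Base(path))
	if err != nil {
		return err
	}
	if _, err := io.Copy(part, f); err != nil {
		return err
	}
	if err := w.Close(); err != nil {
		return err
	}

	resp, err := client.Post(endpoint, w.FormDataContentType(), &body)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return fmt.Errorf("upload of %s failed: %s", path, resp.Status)
	}
	return nil
}

func main() {
	if len(os.Args) < 3 {
		fmt.Println("usage: upload <pkgdir> <server-url>")
		return
	}
	debs, err := findDebs(os.Args[1])
	if err != nil {
		fmt.Println("walk failed:", err)
		return
	}
	// The distro and arch are hardcoded here because all our packages target one arch.
	endpoint := uploadURL(os.Args[2], "stable", "amd64")
	for _, deb := range debs {
		if err := postDeb(http.DefaultClient, endpoint, deb); err != nil {
			fmt.Println(err)
		}
	}
}
```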

However, I'm unable to reconcile deb-simple's support for multiple architectures with this dir without making assumptions about what types of packages are available in this prepared dir. In our case, we're always targeting the same arch, but it does make for a long help string in the cmdline opts explaining how this directory is "special" in that it should contain pkgs of the same arch.

esell commented 8 years ago

@akshaymoghe: I just pushed a new branch that might make life easier on you. I updated the code to trigger repo updates based on filesystem changes so now in theory what you can do is start up deb-simple and then just copy your debs into the directories instead of having to POST them. I realize it's not quite what you were looking for but it might help a little :)

The new branch is here if you'd like to test: https://github.com/esell/deb-simple/tree/fsnotify

NOTE: this new branch uses GB to build so you'll need that.

amoghe commented 8 years ago

We could also invoke createPackagesGz once when the process starts, so that it can pick up debs that may have been placed in the dir before the process starts.

A build workflow may do the following:

  1. Create debs, put them in the dir
  2. Downstream job assumes dir is populated with debs, and starts deb-simple for its build step

Thoughts?
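A rough sketch of that startup pass, assuming nothing about the real `createPackagesGz` signature: the `rebuild` callback below is a stand-in for it, and `dirsWithDebs` and `indexExisting` are hypothetical helper names. The idea is to walk the repo root once before serving and rebuild the index in every directory that already holds packages.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
	"strings"
)

// dirsWithDebs returns the sorted, de-duplicated parent directories
// of every path that ends in ".deb".
func dirsWithDebs(paths []string) []string {
	seen := map[string]bool{}
	var dirs []string
	for _, p := range paths {
		if !strings.HasSuffix(p, ".deb") {
			continue
		}
		dir := filepath.Dir(p)
		if !seen[dir] {
			seen[dir] = true
			dirs = append(dirs, dir)
		}
	}
	sort.Strings(dirs)
	return dirs
}

// indexExisting walks root once and calls rebuild for each directory
// that already contains packages.
func indexExisting(root string, rebuild func(dir string) error) error {
	var files []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() {
			files = append(files, path)
		}
		return nil
	})
	if err != nil {
		return err
	}
	for _, dir := range dirsWithDebs(files) {
		if err := rebuild(dir); err != nil {
			return fmt.Errorf("indexing %s: %w", dir, err)
		}
	}
	return nil
}

func main() {
	root := "."
	if len(os.Args) > 1 {
		root = os.Args[1]
	}
	// Stand-in for the real index builder (createPackagesGz in this thread).
	rebuild := func(dir string) error {
		fmt.Println("would rebuild Packages.gz in", dir)
		return nil
	}
	if err := indexExisting(root, rebuild); err != nil {
		fmt.Println("startup scan failed:", err)
	}
}
```

Whether this runs unconditionally or behind a flag is a separate question; the scan itself is cheap compared to POSTing every file.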

esell commented 8 years ago

I think it might make sense to add that functionality. I'm not sure whether it would be best to make it the default or a config option/flag that you can pass. I'll think about it and push a PR for it soon, though.

amoghe commented 8 years ago

Here are some more thoughts on this:

Currently deb-simple wants to be a "full" featured apt repo server, which means it exposes multiple distros and archs. However, what I want is to get packages installed on a target system without caring about what arch/distro they come from: I compiled them, so I don't really care about those details. This is useful when building containers because it's the easiest way to get packages into a container without them occupying space on disk (which would bloat up the container size significantly). For this use case something like this works better for me.

If we wanted deb-simple to do this, we'd need to encode the distro and arch into the dir hierarchy (which is how it works currently), but this means that the producer of the packages needs to create the packages with this dir hierarchy, or some intermediate job has to copy packages around to conform to this hierarchy. This isn't impossible, but it isn't ideal.

However, if we re-architected the program so that it had a 'reposerver' module, which basically acts as an http servlet/module responsible for ONE distro/arch and serves the packages out of one dir, then the json config might be changed so that it contains one entry per servlet (distro/arch).

E.g.

{
  "repos": {
    "precise": {
      "arch":   "amd64",
      "pkgdir": "/some/place"
    },
    "trusty": {
      "arch":   "i386",
      "pkgdir": "/foo/bar"
    }
  },

  "ssl":  true,
  "cert": "/path/to/cert",
  "key":  "/path/to/key",
  "port": 8080
}

... and in the naive case (which I described when I opened this issue) it would contain just one entry to serve the packages from the specified dir.

{
  "repos": {
    "custom": {
      "arch": "amd64",
      "pkgdir": "/path/to/my/pkgdir"
    }
  }
}
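To make the proposed layout concrete, here is a minimal sketch of how such a config might be decoded in Go. The `Repo` and `Config` types and `parseConfig` are hypothetical names, not existing deb-simple code; the JSON keys are the ones from the examples above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Repo describes one distro entry: a single arch served from a single dir.
type Repo struct {
	Arch   string `json:"arch"`
	PkgDir string `json:"pkgdir"`
}

// Config mirrors the proposed JSON layout, keyed by distro name.
type Config struct {
	Repos map[string]Repo `json:"repos"`
	SSL   bool            `json:"ssl"`
	Cert  string          `json:"cert"`
	Key   string          `json:"key"`
	Port  int             `json:"port"`
}

// parseConfig decodes the JSON and rejects entries that lack an arch or a dir.
func parseConfig(data []byte) (Config, error) {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return Config{}, err
	}
	if len(cfg.Repos) == 0 {
		return Config{}, fmt.Errorf("config defines no repos")
	}
	for name, r := range cfg.Repos {
		if r.Arch == "" || r.PkgDir == "" {
			return Config{}, fmt.Errorf("repo %q needs both arch and pkgdir", name)
		}
	}
	return cfg, nil
}

func main() {
	raw := []byte(`{"repos": {"custom": {"arch": "amd64", "pkgdir": "/path/to/my/pkgdir"}}}`)
	cfg, err := parseConfig(raw)
	if err != nil {
		fmt.Println("bad config:", err)
		return
	}
	for name, r := range cfg.Repos {
		fmt.Printf("%s/%s served from %s\n", name, r.Arch, r.PkgDir)
	}
}
```

Note that `encoding/json` rejects trailing commas, so the config file has to be strict JSON.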

One could consider using an HTTP router package like zenazn/goji or gorilla/mux to register multiple handlers as they are parsed from the config file.

This would also allow folks to reuse the module to write their own programs which may have different names hardcoded into the program so they can run without a config file. For example, I may import this package and write my own main.go that instantiates just one module and puts it in an http.Server with some hardcoded config like port/ssl.

E.g.

svr := debsimple.NewRepoServer(distroName, archType, pkgDir)
http.ListenAndServe(":8080", svr) // or ListenAndServeTLS, with the necessary ssl config
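A slightly fuller sketch of what such a module might look like, using only the standard library's `http.ServeMux` rather than goji or gorilla/mux. `RepoServer`, `NewRepoServer`, and `Prefix` are hypothetical here (no such package exists yet), and the `/<distro>/<arch>/` URL layout is an assumption. It only serves files out of one dir; generating the package index is left out.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
)

// RepoServer serves the packages for exactly one distro/arch pair from one dir.
type RepoServer struct {
	prefix string
	files  http.Handler
}

// NewRepoServer wires a file server for pkgDir under /<distro>/<arch>/.
func NewRepoServer(distro, arch, pkgDir string) *RepoServer {
	base := "/" + distro + "/" + arch
	return &RepoServer{
		prefix: base + "/",
		files:  http.StripPrefix(base, http.FileServer(http.Dir(pkgDir))),
	}
}

// Prefix is the URL subtree this server should be registered under.
func (s *RepoServer) Prefix() string { return s.prefix }

// ServeHTTP makes RepoServer usable anywhere an http.Handler is expected.
func (s *RepoServer) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	s.files.ServeHTTP(w, r)
}

func main() {
	// Use a throwaway dir with a placeholder file instead of real packages.
	dir, err := os.MkdirTemp("", "debs")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer os.RemoveAll(dir)
	placeholder := filepath.Join(dir, "hello.deb")
	if err := os.WriteFile(placeholder, []byte("not a real package"), 0o644); err != nil {
		fmt.Println(err)
		return
	}

	// One handler per config entry; a real program would loop over the config.
	svr := NewRepoServer("custom", "amd64", dir)
	mux := http.NewServeMux()
	mux.Handle(svr.Prefix(), svr)

	// Exercise the handler in-process instead of binding a port.
	rec := httptest.NewRecorder()
	req := httptest.NewRequest("GET", "/custom/amd64/hello.deb", nil)
	mux.ServeHTTP(rec, req)
	fmt.Println("status:", rec.Code)
}
```

Because each `RepoServer` is just an `http.Handler`, a custom `main.go` could register a single instance with hardcoded values and skip the config file entirely.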

Thoughts?

esell commented 8 years ago

At the risk of simplifying this too much, it sounds like at the lowest level you want to be able to specify what directory the packages are being served from with the added bonus of being able to do that multiple times for various distros/architectures/etc, does that sound about right?

I might need to think about this a bit... I agree that having that functionality would probably be nice but I want to come up with an easy (non-bloated) way to do that. The initial goal of this project was to keep the code as basic as possible and limit the dependencies so I want to play around with how I can stick to those goals while adding the functionality you are talking about.

Additionally, I want to keep it so that someone who is not in the same boat as you can just fire this thing up with a minimal config and a basic understanding of how the process works, and start making packages available.