kindredgroup / puppet-forge-server

Private Puppet forge server supports local files and both v1 and v3 API proxies

implements some other v1 download route #25

Closed danilopopeye closed 8 years ago

danilopopeye commented 8 years ago
  GET /system/releases/m/user/user-module.tar.gz

this route is used by Katello to download modules

ref.: https://projects.puppetlabs.com/projects/module-site/wiki/Server-api

i11 commented 8 years ago

Thanks for the pull request! But are you sure it's hardcoded to pull modules like that? The URL used for fetching modules is provided in the releases JSON payload and can be anything, so it would be very strange for a client to be tied to a specific endpoint. Do you have a use case I could examine?

danilopopeye commented 8 years ago

you can try the forge URLs below:

https://forge.puppetlabs.com/api/v1/releases.json?module=puppetlabs/stdlib

{
    "puppetlabs/stdlib": [
        {
            "file": "/system/releases/p/puppetlabs/puppetlabs-stdlib-0.1.1.tar.gz",
            "version": "0.1.1",
            "dependencies": []
        },
        {
            "file": "/system/releases/p/puppetlabs/puppetlabs-stdlib-0.1.2.tar.gz",
            "version": "0.1.2",
            "dependencies": []
        },
        // ...
    ]
}

and with that you can download the module using the received path:

https://forge.puppetlabs.com/system/releases/p/puppetlabs/puppetlabs-stdlib-0.1.1.tar.gz

i11 commented 8 years ago

Yeah, that's what I was referring to. The server will have its own file URLs and it doesn't have to be exactly the same as with the official forge. So it should be working fine even now.
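A minimal sketch of the behaviour described above, assuming a client that follows the v1 releases payload instead of guessing paths. The function name `release_urls` and the host `forge.example.internal` with its `/any/path/` layout are hypothetical; only the first payload entry comes from this thread.

```python
from urllib.parse import urljoin


def release_urls(base_url, payload, module):
    """Map each released version of `module` to an absolute download URL.

    The "file" value is taken as-is from the payload, so the server is free
    to use whatever path layout it likes.
    """
    return {
        release["version"]: urljoin(base_url, release["file"])
        for release in payload.get(module, [])
    }


# Entry copied from the official forge response quoted earlier in the thread.
official = {
    "puppetlabs/stdlib": [
        {
            "file": "/system/releases/p/puppetlabs/puppetlabs-stdlib-0.1.1.tar.gz",
            "version": "0.1.1",
            "dependencies": [],
        }
    ]
}

# Hypothetical private server using a different file layout.
private = {
    "puppetlabs/stdlib": [
        {
            "file": "/any/path/puppetlabs-stdlib-0.1.1.tar.gz",
            "version": "0.1.1",
            "dependencies": [],
        }
    ]
}

official_urls = release_urls("https://forge.puppetlabs.com", official, "puppetlabs/stdlib")
private_urls = release_urls("https://forge.example.internal", private, "puppetlabs/stdlib")
print(official_urls["0.1.1"])
print(private_urls["0.1.1"])
```

A client written this way never needs to know the `/system/releases/...` layout, which is why the file path does not have to match the official forge.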

danilopopeye commented 8 years ago

but the GET /modules.json does not return a module path: api/v1/modules.rb

And, this is the log of a Katello sync to the forge server:

192.168.1.3 - - [06/Aug/2015:14:04:27] "GET /PULP_MANIFEST HTTP/1.1" 404 9
192.168.1.3 - - [06/Aug/2015 14:04:28] "GET /modules.json HTTP/1.1" 200 496 0.0137
192.168.1.3 - - [06/Aug/2015 14:04:29] "GET /system/releases/u/uoldiveo/uoldiveo-postgresql_uoldiveo-0.1.0.tar.gz HTTP/1.1" 200 3576 0.0007
192.168.1.3 - - [06/Aug/2015 14:04:30] "GET /system/releases/u/uoldiveo/uoldiveo-htop-0.3.1.tar.gz HTTP/1.1" 200 92823 0.0019
#  this is the response of the request `/modules.json` above
[
    {
        "author": "uoldiveo",
        "full_name": "uoldiveo/htop",
        "name": "htop",
        "desc": null,
        "version": "0.3.1",
        "project_url": "http://gitlab.lab/puppet/uoldiveo-htop",
        "releases": [
            {
                "version": "0.3.1"
            },
            {
                "version": "0.3.0"
            },
            {
                "version": "0.2.0"
            },
            {
                "version": "0.1.0"
            }
        ],
        "tag_list": [
            "uoldiveo",
            "htop"
        ]
    },
    {
        "author": "uoldiveo",
        "full_name": "uoldiveo/postgresql_uoldiveo",
        "name": "postgresql_uoldiveo",
        "desc": null,
        "version": "0.1.0",
        "project_url": null,
        "releases": [
            {
                "version": "0.1.0"
            }
        ],
        "tag_list": [
            "uoldiveo",
            "postgresql_uoldiveo"
        ]
    }
]

I'm trying to find exactly where this request is made.

danilopopeye commented 8 years ago

@i11 this url is hardcoded in the Pulp Puppet project for module downloading:

# File name inside of a module where its metadata is found
MODULE_METADATA_FILENAME = 'metadata.json'

# Location in the repository where a module will be hosted
# Substitutions: author first character, author
HOSTED_MODULE_FILE_RELATIVE_PATH = 'system/releases/%s/%s/'

# Name template for a module
# Substitutions: author, name, version
MODULE_FILENAME = '%s-%s-%s.tar.gz'

# Location in Pulp where modules will be stored (the filename includes all
# of the uniqueness of the module, so we can keep this flat)
# Substitutions: filename
STORAGE_MODULE_RELATIVE_PATH = '%s'

pulp_puppet/common/constants.py
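To illustrate how those two templates combine into the paths seen in the sync log, here is a rough sketch. The two constants are copied from the snippet above; the helper `hosted_module_path` is hypothetical and is not claimed to be how Pulp itself joins them.

```python
# Constants copied from pulp_puppet/common/constants.py as quoted above.
HOSTED_MODULE_FILE_RELATIVE_PATH = 'system/releases/%s/%s/'
MODULE_FILENAME = '%s-%s-%s.tar.gz'


def hosted_module_path(author, name, version):
    """Build the request path for one module release (hypothetical helper)."""
    # Substitutions: author first character, author
    directory = HOSTED_MODULE_FILE_RELATIVE_PATH % (author[0], author)
    # Substitutions: author, name, version
    filename = MODULE_FILENAME % (author, name, version)
    return '/' + directory + filename


# Values taken from the /modules.json response posted earlier.
htop_path = hosted_module_path('uoldiveo', 'htop', '0.3.1')
postgresql_path = hosted_module_path('uoldiveo', 'postgresql_uoldiveo', '0.1.0')
print(htop_path)
print(postgresql_path)
```

Both results match the two `/system/releases/u/uoldiveo/...` requests in the Katello sync log, which is consistent with the path being derived from `/modules.json` fields alone.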

i11 commented 8 years ago

I haven't used Katello myself yet, so I can't comment much on its internals, however I think that it should be (and probably is) reading the releases json payload (e.g. https://forge.puppetlabs.com/api/v1/releases.json?module=puppetlabs/stdlib) and using file URLs from it to fetch the modules.

The module URLs you have in the log are those returned by forge.puppetlabs.com, but that doesn't have much to do with how this project serves modules, apart from the API of course. The module file URL is not part of the API and therefore doesn't have to be replicated.

Pulp constants just show that probably pulp is used to serve files for forge.puppetlabs.com, nothing more.

I probably could be more helpful if you would describe your use case and any particular problems you're trying to solve by this pull request.

danilopopeye commented 8 years ago

> I haven't used Katello myself yet, so I can't comment much on its internals, however I think that it should be (and probably is) reading the releases json payload (e.g. [...]) and using file URLs from it to fetch the modules.

This was what I first thought too, but if you look at the logs I posted, there's no request to /releases.json. The request to the file URLs, as you called them, seems to be hardcoded in the Pulp Puppet module. It seems it was that way before there was any Forge API definition from PuppetLabs. Just guessing here.

> I probably could be more helpful if you would describe your use case and any particular problems you're trying to solve by this pull request.

Ok, so this is the use case:

Katello needs to sync our internal modules that are served by this gem. The problem is that it first tries to fetch a /PULP_MANIFEST that does not exist, since this is not a forge created by the Pulp Puppet helper. Then it fetches /modules.json, and the sync task starts downloading all the packages listed in the /modules.json response.

I've spent some time reading the Pulp docs and code, trying to find a way to force v3, but couldn't find one.

From everything I've read so far about Katello and Pulp, the only two ways they sync Puppet modules are this file URL and the PULP_MANIFEST they created.

i11 commented 8 years ago

Yeah. The main issue here is Pulp. It doesn't "speak" Puppet; it just assumes that modules will have certain URLs. Plain and simple, but not very robust. I would really prefer improving the Pulp Puppet plugin instead of terminating this endpoint.

Some sort of plugin feature might make proprietary implementations such as this easier, but it's not in place yet and it would probably take me a while to add. Nginx or Apache would do a much better job of statically serving module files.

nginx example:

location ~ ^/system/releases/.+/.+/(.+\.tar\.gz)$ {
    alias /my/modules/directory/$1;
}

danilopopeye commented 8 years ago

yeah, I get your point and it makes sense. I'll use nginx or Apache to solve this. :+1:

I'm posting the working version here in case someone happens to need it.

location ~* ^/system/releases/[a-z]/.+/(.+)$ {
    return 301 /v3/files/$1;
}
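
For anyone who wants to check what this rule does before deploying it, the same pattern can be approximated in Python. This is only a sketch: `redirect_target` is a hypothetical helper, and `re.IGNORECASE` stands in for the case-insensitive `~*` modifier of the nginx location.

```python
import re

# Same pattern as the nginx location above; ~* means case-insensitive.
REDIRECT_PATTERN = re.compile(r'^/system/releases/[a-z]/.+/(.+)$', re.IGNORECASE)


def redirect_target(path):
    """Return the /v3/files/ path the rule would redirect to, or None."""
    match = REDIRECT_PATTERN.match(path)
    # The capture group holds everything after the last slash (the tarball name).
    return '/v3/files/' + match.group(1) if match else None


# Path taken from the Katello sync log earlier in the thread.
target = redirect_target('/system/releases/u/uoldiveo/uoldiveo-htop-0.3.1.tar.gz')
untouched = redirect_target('/modules.json')
print(target)
print(untouched)
```

The rule rewrites only the hardcoded `/system/releases/...` paths and leaves other requests such as `/modules.json` alone.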

thanks for the time @i11! I'm gonna close this, ok?

i11 commented 8 years ago

Sure thing. Sorry I couldn't be more helpful... Thank you for your input!