kubernetes / community

Kubernetes community content
Apache License 2.0

Create a URL shortener #2959

Closed (justaugustus closed this 3 years ago)

justaugustus commented 5 years ago

...that isn't nginx. We currently use nginx configs here to create vanity URLs for Kubernetes web properties, but it would be great to have something more robust, self-service, [insert more descriptors].

I don't have specific parameters for what we'd want this to look like; I'm just capturing this so it doesn't get lost in Slack history.

/good-first-issue
/sig contributor-experience
/cc @tpepper @BenTheElder @mrbobbytables

BenTheElder commented 5 years ago

IMHO, it would be great if it still had a simple config file checked in but, like prow or peribolos, was deployed automatically on merge. That way a team can still own and review proposed short links, but nobody needs to deploy manually or understand nginx config. Something like:

links:
# becomes: `https://go.k8s.io/community-repo` redirect to `https://github.com/kubernetes/community/`
- community-repo: https://github.com/kubernetes/community/
# becomes: `https://go.k8s.io/sig-beard` redirect to `https://github.com/spiffxp`
- sig-beard: https://github.com/spiffxp

justaugustus commented 5 years ago

@BenTheElder -- I dig it!

ameukam commented 5 years ago

Do we want to build something from scratch or kubernetize open source tools like Polr?

cblecker commented 5 years ago

/remove-good-first-issue

Definitely a help wanted, but it's not a defined enough request for gfi. There will need to be some back and forth.

BenTheElder commented 5 years ago

I'd actually suggest implementing it ourselves, since probably the interesting part is having a very minimal config.

A hypothetical home-grown service would just need to read the inbound URL and respond with an HTTP 301 redirect matching the config, or a 404 if there's no entry. This could be written in a very tiny bit of Go and deployed to the existing k8s.io cluster, reading the config from a ConfigMap. It doesn't need to be complex; we're not serving at crazy scales, and redirects are cheap.
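For illustration only, here is a minimal sketch of what such a service might look like. It is not code from this thread: the names `resolve` and `newHandler` are hypothetical, the link table is hard-coded rather than loaded from a ConfigMap, and `main` exercises the handler in-process instead of binding a port.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// resolve looks up the short name (the request path without slashes)
// in the link table.
func resolve(links map[string]string, path string) (string, bool) {
	target, ok := links[strings.Trim(path, "/")]
	return target, ok
}

// newHandler returns a handler that answers 301 for known short names
// and 404 for everything else.
func newHandler(links map[string]string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		target, ok := resolve(links, r.URL.Path)
		if !ok {
			http.NotFound(w, r)
			return
		}
		http.Redirect(w, r, target, http.StatusMovedPermanently)
	})
}

func main() {
	// Assumption: stand-in for the table that would be loaded from the
	// checked-in config via a ConfigMap.
	links := map[string]string{
		"community-repo": "https://github.com/kubernetes/community/",
		"sig-beard":      "https://github.com/spiffxp",
	}
	h := newHandler(links)

	// Exercise the handler in-process instead of binding a port.
	for _, path := range []string{"/sig-beard", "/no-such-link"} {
		rec := httptest.NewRecorder()
		h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
		fmt.Println(path, rec.Code, rec.Header().Get("Location"))
	}
}
```

A real deployment would pass the handler to `http.ListenAndServe` and load the table from the mounted config file; both are left out to keep the sketch short.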

Most of the third party tools are going to be very nice when you want to allow creating redirects from a UI, but I think we might not want to do that since there's no easy way to track or review it. We don't want k8s.io serving redirects to malware etc, so it probably needs to be reviewable.

Ideally we'd basically never need to touch the go code or deployment, and just automate pushing the new configmap contents from the in-repo config (which prow / peribolos already have tooling for).

We'd need agreement on such a design, but if we did go this route it probably would be a good first issue at that point, it should be very simple to implement.

Edit: we should definitely look around to see if exactly this exists already, but so far I've seen tools that are much fancier than what we need, and that target a web UI rather than a config file. I didn't dig for long, though.

cblecker commented 5 years ago

I'm also not opposed to a commercial SaaS offering though, if branding/marketing wanted to be able to track the use of redirects: finding which short links are being used and which aren't, stuff like that.

stp-ip commented 5 years ago

There is a project based on DNS TXT records: https://about.txtdirect.org

We also have a hosted instance running to test it out: https://about.txtdirect.org/hosted/

Internally, and especially for our open source projects, we use a git repository with the various BIND zone files to configure this. The interesting thing about this is that it's based on DNS and therefore allows for delegation. Say we want a global shortener and some SIG-specific ones:

;configure instance and path type
s.k8s.io                    86000 IN CNAME   txtd.io.
_redirect.s.k8s.io          86400 IN TXT     "v=txtv0;to=https://kubernetes.io{uri};root=https://kubernetes.io/;type=path"

;global redirects, if they don't match redirect to kubernetes.io
_redirect.shortened.g.s.k8s.io   86400 IN TXT     "v=txtv0;to=https://kubernetes.io/global;type=host;code=302"

;specific redirects could also be delegated to other zones
_redirect.shortened.sig-aws.s.k8s.io   86400 IN TXT     "v=txtv0;to=https://kubernetes.io/sig-aws-specific-redirect;type=host;code=302"

The above would enable:

  • s.k8s.io/some-random-thing -> kubernetes.io/some-random-thing (fallback)
  • s.k8s.io -> kubernetes.io (root fallback)
  • s.k8s.io/g/shortened -> kubernetes.io/global
  • s.k8s.io/sig-aws/shortened -> kubernetes.io/sig-aws-specific-redirect
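To make the record format concrete, here is a rough sketch of splitting one of the TXT values above into its fields. This is not TXTDirect's actual parser; `record` and `parseRecord` are hypothetical names, only the keys that appear in the examples are handled, and a target URL containing `;` would not survive this naive split.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// record holds the fields that appear in the TXT examples above.
type record struct {
	Version string // "v"
	To      string // "to"
	Root    string // "root"
	Type    string // "type"
	Code    int    // "code"; 0 when the record does not set one
}

// parseRecord splits a TXT value of the form "k=v;k=v" into a record.
// Keys it does not know are ignored.
func parseRecord(txt string) (record, error) {
	var rec record
	for _, field := range strings.Split(txt, ";") {
		// Cut at the first "=" so that "=" inside a URL is preserved.
		key, value, found := strings.Cut(field, "=")
		if !found {
			return record{}, fmt.Errorf("malformed field %q", field)
		}
		switch key {
		case "v":
			rec.Version = value
		case "to":
			rec.To = value
		case "root":
			rec.Root = value
		case "type":
			rec.Type = value
		case "code":
			code, err := strconv.Atoi(value)
			if err != nil {
				return record{}, fmt.Errorf("bad code %q: %w", value, err)
			}
			rec.Code = code
		}
	}
	return rec, nil
}

func main() {
	rec, err := parseRecord("v=txtv0;to=https://kubernetes.io/global;type=host;code=302")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", rec)
}
```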

It can also redirect dynamically, or extract things via regex and use placeholders.

It also allows delegating whole subpaths to a specific zone. Say sig-aws, and therefore control over s.k8s.io/sig-aws/*, could be delegated to a zone or just a file they control. Each SIG could later manage its own redirects via BIND files that are served via DNS. We use CoreDNS for this and are working on an easy platform to link a repository to be served by a reliable DNS provider.

There are also metrics available via prometheus. The project is based on Caddy and also enables auto TLS.

Bonus point: We are working on container vanity URLs so we could later move our container images to the same redirect format on a separate subdomain for example such as c.k8s.io/kubernetes/coredns.

(Disclaimer: Maintainer and creator)

BenTheElder commented 5 years ago

Interesting.

We do have another project with DNS management (https://github.com/kubernetes/k8s.io/tree/master/dns) in the works, which will allow similar ownership and management of the domains at some point IIRC. Sounds like you might be able to lend a hand with that :-)


@cblecker A commercial service seems like overkill for serving some 301s / 404s imho, but perhaps not :^)

thockin commented 5 years ago

I'd rather kubernetize an OSS project, if we can. We should not be reinventing everything just for fun.

BenTheElder commented 5 years ago

I would agree, but I am not seeing something particularly suitable currently.

https://polrproject.org/ (previously mentioned) seems to be the most popular OSS one, but it's based around a PHP web UI / REST API, and requires a SQL database. Other projects seem very similar.

IMHO the ideal solution would just read declarative config from a ConfigMap rather than a database; I haven't spotted anything easily adapted to that. We could of course write a controller to talk to one of these REST APIs, but I suspect that's going to be more complex than serving redirects.

If we don't want the declarative config, then we could probably just deploy Polr somewhere today, though we'd need to think about moderation.

thockin commented 5 years ago

If we really can't find one, then making one is fine, I guess. It's not like it is a complicated program.

stp-ip commented 5 years ago

As a sidenote the current hosted instance and the underlying git -> DNS stuff is running in K8s.

Basically:

  • git-sync (sidecar) pulling in a git repository with BIND files + CoreDNS serving the DNS <-- our own DNS records
  • CoreDNS sends notifies to our DNS provider on new commits
  • DNS provider is fronting our CoreDNS instance for reliability and speed
  • TXTDirect (Caddy) with autoTLS (stateless besides the TLS certs) + CoreDNS cache (sidecar) <-- txtd.io stack

Not sure, if that counts as kubernetize. Definitely happy to help.

Sidenote: We just merged our first iteration of Docker vanity URLs:

docker pull c.txtdirect.org/txtdirect:dev-3fd9be

Configuration:

c.txtdirect.org.                  86400 IN CNAME   txtd.io.
_redirect.c.txtdirect.org.        86400 IN TXT     "v=txtv0;to=https://gcr.io/txtdirect-223710/txtdirect;root=https://about.txtdirect.org;type=dockerv2"

fejta-bot commented 5 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

BenTheElder commented 5 years ago

I completely ran out of bandwidth to look into options for this further.

Totally agree we should try to find / select one to re-use first and get that set up.

We do at least have DNS managed by config now... perhaps the TXT record redirector could be integrated with that.

Pretty sure ContribEx still wants this.

/remove-lifecycle stale

nikhita commented 5 years ago

/kind feature
/priority important-longterm

misterikkit commented 5 years ago

Mind if I help with this?

https://github.com/shlinkio/shlink has an MIT license, but more importantly, they have the best name and logo of any open source URL shortener.

I was able to figure out serving a "static" set of short URLs defined in the deployment yaml. Assuming the deployment yaml is checked in somewhere, we can code review new short links.

BenTheElder commented 5 years ago

/assign @misterikkit :tada:

BenTheElder commented 5 years ago

/remove-help

thockin commented 5 years ago

Can we have it pull config from a file, so we can use a ConfigMap? Or maybe we can give it a helper that watches the file and sends SIGHUP or something?

misterikkit commented 5 years ago

@thockin that's actually what I ended up doing (https://github.com/kubernetes/k8s.io/pull/211). It currently calls exit(0) on ConfigMap change, causing a container restart within the pod. (Is that an antipattern?) This causes global downtime no matter how many replicas there are, but it's good enough for now.

I'll subscribe to https://github.com/kubernetes/kubernetes/issues/22368 ...
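As an aside on the exit-on-change approach: a server that owns its link table could swap the table in place instead of restarting. The sketch below is hypothetical and is not what the linked PR does; `table`, `lookup`, and `replace` are made-up names, and the trigger for a reload (a file watcher or a SIGHUP handler) is deliberately left out.

```go
package main

import (
	"fmt"
	"sync"
)

// table is a link table that can be replaced while requests are served.
type table struct {
	mu    sync.RWMutex
	links map[string]string
}

// lookup returns the target for a short name.
func (t *table) lookup(name string) (string, bool) {
	t.mu.RLock()
	defer t.mu.RUnlock()
	target, ok := t.links[name]
	return target, ok
}

// replace swaps in a freshly loaded table. Readers see either the old
// map or the new one, never a partially updated one.
func (t *table) replace(links map[string]string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.links = links
}

func main() {
	tbl := &table{}
	tbl.replace(map[string]string{"sig-beard": "https://github.com/spiffxp"})
	target, ok := tbl.lookup("sig-beard")
	fmt.Println(target, ok)

	// A reload (triggered by a file watcher or SIGHUP handler, not shown)
	// would build a new map from the config and swap it in.
	tbl.replace(map[string]string{"community-repo": "https://github.com/kubernetes/community/"})
	target, ok = tbl.lookup("sig-beard")
	fmt.Println(target, ok)
}
```

Because the process keeps running, a config change would not take every replica down at once the way a simultaneous exit does.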

stp-ip commented 5 years ago

Would have loved to see the now managed DNS config to be used via the TXT records, but happy to have a solution \o/. If the decision ever comes back to do it via TXT records and TXTDirect, let me know. I'll also be at Next to discuss this in detail.

@misterikkit I definitely oppose the viewpoint on best name and logo :smiling_imp:

misterikkit commented 5 years ago

@stp-ip Could you explain why using DNS TXT records is a preferable solution? Is it easier to manage?

stp-ip commented 5 years ago

Not specifically preferable or easier. It depends on the viewpoint, but it can work with the new DNS setup and review process in k8s. The application is mostly stateless.

Additionally it might open up the avenue to not just use it for redirects but also container image urls, go pkg urls and git repos later on, if that ever wants to be controlled at the endpoint level.

It also has support for various dynamic redirects using placeholders such as splitting up redirects depending on http method, cookie or more: https://about.txtdirect.org/docs/placeholders/

An interesting feature is that DNS comes with delegation built in. So a subdomain or subpath could easily be delegated to a SIG, which controls its own DNS repo etc.

Sidepoint is that the current PoC is written in Go and based on Caddy, which enables autoTLS and additional configuration out of the box. So upstreaming changes might be easier for contributors.

misterikkit commented 5 years ago

Copying some of @thockin's feedback here:

I don't know how shlink or its DB works

We'd need a plan (scripted) for how to test changes before going live. We'll want a static IP assigned and in the YAML, and probably some monitoring/alerting. We'll need a resource request (CPU) and limit (memory). If we want HA, we'll need 2 replicas and a random backoff before terminating and restarting.

It might even be better to make our own Docker image here, so we can push it to GCR and get vulnerability scans on it, which means you could un-embed the script.

Now that I have a better understanding of txtdirect, I think it does a good job of addressing these points.

cblecker commented 5 years ago

Having a cursory look myself, I'm +1 on txtdirect. DNS as a database 👍

thockin commented 5 years ago

My grumble with txtdirect is taking a dep on a free service.

cblecker commented 5 years ago

As I mentioned on the PR, it's an OSS project that we can host ourselves.

stp-ip commented 5 years ago

As mentioned on the PR and a few additions to give more context:

  • OSS: github.com/txtdirect/txtdirect
  • OSS docs: github.com/txtdirect/website
  • OSS builds using the plain makefile: https://gcr.io/v2/txtdirect-223710/txtdirect/tags/list
  • Built on the Caddy webserver: github.com/mholt/caddy

Incorporates prometheus metrics, does autoTLS and has k8s config files available on request including a sidecar DNS cache (coredns).

thockin commented 5 years ago

Ooops. Sorry. The only link I clicked took me to a hosted service.

stp-ip commented 5 years ago

What's the current canary process?

Happy to work with @misterikkit to get the self-hosted part set up.

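So plan of action?:

  • merge specific tmp subdomain go2.k8s.io to test out general/canary process using txtd.io
  • get TXTDirect set up in our k8s cluster
  • move tmp subdomain to self-hosted version
  • test all the things \o/
  • switch go.k8s.io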

thockin commented 5 years ago

The DNS canary process is to push changes to the zones "canary.k8s.io" and "canary.kubernetes.io" and then verify that all the records we expect to find are present.

If we add a "go.k8s.io" delegation and zone files, we'll want a similar process. Even if we don't do a delegation, really.

We'll also want to make sure the test covers these TXT records.
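A check along those lines could look roughly like the sketch below. It is not the existing canary script: `lookupFunc` and `missing` are hypothetical names, the record name in `main` is made up for illustration, and a fake lookup is used so the sketch runs offline. In a real check, `net.LookupTXT` has the same signature and could be passed in instead.

```go
package main

import (
	"fmt"
	"sort"
)

// lookupFunc resolves the TXT values for a name. net.LookupTXT matches
// this signature; a fake is used in main so the sketch runs offline.
type lookupFunc func(name string) ([]string, error)

// contains reports whether want is one of values.
func contains(values []string, want string) bool {
	for _, v := range values {
		if v == want {
			return true
		}
	}
	return false
}

// missing returns, in sorted order, the names whose expected TXT value
// was not served (or whose lookup failed).
func missing(expected map[string]string, lookup lookupFunc) []string {
	var out []string
	for name, want := range expected {
		values, err := lookup(name)
		if err != nil || !contains(values, want) {
			out = append(out, name)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Hypothetical canary record; the name is an assumption.
	expected := map[string]string{
		"_redirect.go.canary.k8s.io.": "v=txtv0;to=https://kubernetes.io{uri};type=path",
	}
	// Fake resolver that serves nothing, so the record is reported missing.
	fake := func(name string) ([]string, error) { return nil, nil }
	fmt.Println(missing(expected, fake))
}
```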

cblecker commented 5 years ago

The other thing we'll need is the manifests to install on our cluster.

Similar to the nginx redirector (https://github.com/kubernetes/k8s.io/tree/master/k8s.io), we'll need a deployment, service, ingress, certificate, and optionally a config map if there is any config that needs to be passed. We should have a canary and prod one, just like the current nginx redirector.

stp-ip commented 5 years ago

This PR is planned for next weekish. I'm still trying to understand canary.sh and whether it is necessary. The suggested subdirectory would be /redirect.k8s.io, which also serves as the DNS name for the service. That way we can CNAME go.k8s.io to it first and other redirects later, if necessary. Don't wanna move things multiple times.

I'll get these PRs ready.

stp-ip commented 5 years ago

cblecker commented 5 years ago

/remove-lifecycle stale

cblecker commented 5 years ago

/remove-lifecycle stale

stp-ip commented 4 years ago

/remove-lifecycle stale

stp-ip commented 4 years ago

/unassign misterikkit
/assign

stp-ip commented 4 years ago

/remove-lifecycle stale

fejta-bot commented 4 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

fejta-bot commented 4 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten

stp-ip commented 4 years ago

/remove-lifecycle rotten

mrbobbytables commented 4 years ago

@stp-ip do you know if there has been any traction with this by chance? :x

stp-ip commented 4 years ago

Slow progress especially due to us having the nginx in place.

mrbobbytables commented 4 years ago

kk, no worries - I just wanted to double check 👍

fejta-bot commented 3 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale