mrkschan / cuttle

:octopus: - Rate limit HTTP API calls per access token
MIT License

Segmentation Fault #16

Open andresdouglas opened 8 years ago

andresdouglas commented 8 years ago

It seems to have segfaulted with the following config file:

Main: Forwarding request to https://name.myshopify.com:443/admin/shop.json
Segmentation fault (core dumped)
root@:/var/cuttle# cat cuttle.yml
addr: :3128
cacert: './cacert.pem'
cakey: './privkey.pem'
zones:
  - host: "*.myshopify.com"
    shared: false
    control: rps
    rate: 2
  - host: "*"
    shared: true
    control: noop
andresdouglas commented 8 years ago

OK, it seems like the segfault happens somewhat randomly. The following also caused it to happen. I ran it again and it's working, but it segfaults once in a while.

# /var/cuttle/bin/cuttle -f /var/cuttle/cuttle.yml
INFO[2016-06-10T15:10:27-04:00] Listening on :3128
INFO[2016-06-10T15:10:32-04:00] Main: Forwarding request to https://domain-staging.myshopify.com:443/admin/products/count.json
Segmentation fault (core dumped)
# cat /var/cuttle/cuttle.yml
addr: :3128
cacert: '/var/cuttle/cacert.pem'
cakey: '/var/cuttle/privkey.pem'
zones:
  - host: "*.myshopify.com"
    shared: false
    control: rps
    rate: 2
mrkschan commented 8 years ago

@andresdouglas which golang version do you use to compile cuttle?

andresdouglas commented 8 years ago
# go version
go version xgcc (Ubuntu 4.9.3-0ubuntu4) 4.9.3 linux/amd64
mrkschan commented 8 years ago

@andresdouglas are you using the Go compiler from golang.org?

My version string looks totally different.

go version
go version go1.6.2 linux/amd64

I didn't know there was a v4.9.3...

andresdouglas commented 8 years ago

OK, did some updating of packages and re-installing of go. Now on

# go version
go version go1.2.1 linux/amd64

It seems like it's still a bit wonky. The service seems to continue to run, but on the client side I'm getting a

<urlopen error [Errno 54] Connection reset by peer>

In /var/log/cuttle.err I only get info logs like:

time="2016-06-09T16:43:30-04:00" level=info msg="Listening on :3128"
time="2016-06-09T16:44:32-04:00" level=info msg="Main: Forwarding request to https://[...]-staging.myshopify.com:443/admin/shop.json"
time="2016-06-09T16:45:14-04:00" level=info msg="Listening on :3128"
time="2016-06-09T16:45:43-04:00" level=info msg="Main: Forwarding request to https://[...]-staging.myshopify.com:443/admin/shop.json"

Should it work with go1.2.1 or do I need to update to go1.6?

Thanks!

andresdouglas commented 8 years ago

Just installed go 1.6. I re-compiled cuttle with go 1.6. Should running the new ./bin/cuttle be sufficient or do I also have to add the 1.6 go binary to the path when running it?

andresdouglas commented 8 years ago

Update: it no longer segfaults after compiling with 1.6, but after a few tens of API calls routed through it, it returns a Bad Status Line "" (error in Django).

andresdouglas commented 8 years ago

Found something rather interesting when running the proxy manually (only way it seems to print full logging). Seems like a bunch of people - and by this I mean thousands of requests per minute - found and have been using the proxy!

[image: screen shot 2016-06-21 at 5 30 50 pm]

This is what my cuttle.yml file looks like atm:

# cat cuttle.yml
addr: :3128
cacert: '/var/cuttle/cacert.pem'
cakey: '/var/cuttle/privkey.pem'
zones:
  - host: "*.myshopify.com"
    shared: false
    control: rps
    rate: 2
  - host: "*"  # Apply to requests forwarded to all domains.
    shared: true  # The rate limit is shared by all domains.
    control: rps  # Use request-per-second rate limit control.
    rate: 2  # At most 2 requests per second in the entire zone.

Is there a way to limit who can route requests through the proxy (whitelisting?), or limit requests to be routed to the shopify domain only?

I tried setting rate: 0 for the wildcard zone, but got:

# ./bin/cuttle -f cuttle.yml
INFO[2016-06-22T03:39:17-04:00] Listening on :3128
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x0 pc=0x5575d8]

goroutine 8 [running]:
panic(0x7bc3e0, 0xc8200100e0)
    /usr/local/go/src/runtime/panic.go:464 +0x3e6
github.com/mrkschan/cuttle/cuttle.(*RPSControl).Start.func1(0xc82010edb0)
    /var/cuttle/src/github.com/mrkschan/cuttle/cuttle/limitcontrol.go:71 +0x908
created by github.com/mrkschan/cuttle/cuttle.(*RPSControl).Start
    /var/cuttle/src/github.com/mrkschan/cuttle/cuttle/limitcontrol.go:88 +0x35
mrkschan commented 8 years ago

Oops... You're exposing the proxy to the Internet, so it will be found by port scanners :)

Anyway, I'm wondering whether cuttle should handle whitelisting / blacklisting of clients. The request-per-second design is not compatible with a rate of 0. I think we need input validation there to prevent such a setup. On the other hand, it is possible to add a control that simply blocks all access.

A workaround is to use a firewall for client whitelisting, or to set up an upstream proxy for whitelisting outgoing traffic.

mrkschan commented 8 years ago

Also, you may consider setting up a secure channel (e.g. VPN / SSH tunnel / sshuttle) to avoid exposing cuttle to the Internet.

andresdouglas commented 8 years ago

Ha, yes. I think it eventually fails because of this

INFO[2016-06-22T04:07:51-04:00] RPSControl[host:*]: Waiting for 1000ms.
INFO[2016-06-22T04:07:51-04:00] Main: Forwarding request to http://i.y.qq.com/pcmusic/fcgi-bin/qm_rplstingmus.fcg?version=12&miniversion=57&uin=300000280&key=&guid=&gkey=&musicid=106867039&fromtag=0&music=0&errcode=0&level=1&fileid=0&hideuin=335266366&method=1&pcachetime=1466582571
2016/06/22 04:07:51 http: Accept error: accept tcp [::]:3128: accept4: too many open files; retrying in 5ms
2016/06/22 04:07:51 http: Accept error: accept tcp [::]:3128: accept4: too many open files; retrying in 10ms
2016/06/22 04:07:51 http: Accept error: accept tcp [::]:3128: accept4: too many open files; retrying in 20ms
2016/06/22 04:07:51 http: Accept error: accept tcp [::]:3128: accept4: too many open files; retrying in 40ms
2016/06/22 04:07:51 http: Accept error: accept tcp [::]:3128: accept4: too many open files; retrying in 80ms
2016/06/22 04:07:51 http: Accept error: accept tcp [::]:3128: accept4: too many open files; retrying in 160ms
2016/06/22 04:07:52 http: Accept error: accept tcp [::]:3128: accept4: too many open files; retrying in 320ms
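
Each accepted connection consumes a file descriptor, and a proxy usually holds a second one for the upstream connection, so a flood of requests can exhaust the per-process limit and produce the accept4 errors above. As a rough diagnostic sketch (not part of cuttle; the helper name openFileLimit is made up for illustration), a Go program on a Unix-like system can read its own limit like this:

```go
package main

import (
	"fmt"
	"syscall"
)

// openFileLimit returns the soft and hard RLIMIT_NOFILE values for the
// current process. The soft limit is the one that triggers
// "too many open files" when it is reached.
func openFileLimit() (soft, hard uint64, err error) {
	var rl syscall.Rlimit
	if err = syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		return 0, 0, err
	}
	return uint64(rl.Cur), uint64(rl.Max), nil
}

func main() {
	soft, hard, err := openFileLimit()
	if err != nil {
		fmt.Println("getrlimit failed:", err)
		return
	}
	fmt.Printf("open file limit: soft=%d hard=%d\n", soft, hard)
}
```

Raising the limit only postpones the failure while the proxy stays open to the world; restricting who can connect addresses the cause.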
andresdouglas commented 8 years ago

What would be the simplest thing to set up to prevent exposing it to the world? I'm running our application server on heroku, and cuttle on digital ocean. It may be hard to do an ssh tunnel from heroku. On the other hand I would also need to connect to cuttle from a couple dev laptops that likely change IPs frequently.

mrkschan commented 8 years ago

Since you're on Heroku, you cannot have a secure channel afaik. Would you consider having an upstream proxy that blocks by URL as a workaround? You may try https://steelmon.wordpress.com/2009/11/22/setting-up-a-strict-whitelist-proxy-server-using-squid/

andresdouglas commented 8 years ago

Ah, yes, I set up a firewall with no problem, but was disappointed when I tried to figure out how to get my Heroku instance's static IP.

Yes, I'll try setting up squid. Will that prevent the issue I show above where it seems like we run out of file descriptors?

Second question: how do I point cuttle to the upstream squid proxy?

It seems like cuttle would make for a good heroku add-on. I'd pay for it. Have you thought about that?

thanks @mrkschan

andresdouglas commented 8 years ago

OK, squid set up, but not sure how to route requests from cuttle to squid, or should the requests go from squid to cuttle?

mrkschan commented 8 years ago

Run cuttle with the https_proxy environment variable set:

https_proxy=https://..... cuttle ....
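
For context, Go's net/http package provides http.ProxyFromEnvironment, which the default transport uses to read HTTP_PROXY, HTTPS_PROXY and NO_PROXY (or their lowercase forms). Whether cuttle relies on exactly this mechanism is an assumption here. The sketch below uses a hypothetical helper, proxyFor, that only reports which proxy a request to a given URL would go through:

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// proxyFor reports the proxy URL that Go's environment-based proxy
// selection would pick for a GET request to target. A nil URL means the
// request would be sent directly.
func proxyFor(target string) (*url.URL, error) {
	req, err := http.NewRequest(http.MethodGet, target, nil)
	if err != nil {
		return nil, err
	}
	return http.ProxyFromEnvironment(req)
}

func main() {
	targets := []string{
		"https://name.myshopify.com/admin/shop.json",
		"http://localhost:3128/",
	}
	for _, target := range targets {
		proxy, err := proxyFor(target)
		switch {
		case err != nil:
			fmt.Printf("%s: error: %v\n", target, err)
		case proxy == nil:
			fmt.Printf("%s: direct\n", target)
		default:
			fmt.Printf("%s: via %s\n", target, proxy)
		}
	}
}
```

Note that requests whose destination is localhost or a loopback address are never sent through the environment proxy, and that the environment is read once per process, so the variable has to be set before the program starts.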

[image: KS Chan on about.me]

KS Chan about.me/mrkschan http://about.me/mrkschan

andresdouglas commented 8 years ago

Thanks for the reply @mrkschan

I just ran it as https_proxy='https://127.0.0.1:3129' ./b/cuttle -f cuttle.yml. That doesn't prevent the requests from reaching cuttle, so I still ran into the problem mentioned above with open files:

INFO[2016-06-22T06:02:33-04:00] RPSControl[host:*]: Waiting for 1000ms.
INFO[2016-06-22T06:02:33-04:00] Main: Forwarding request to http://stat.pc.music.qq.com/fcgi-bin/qm_reportlstmus.fcg?pcachetime=1466589548
2016/06/22 06:02:33 http: Accept error: accept tcp [::]:3128: accept4: too many open files; retrying in 320ms
2016/06/22 06:02:33 http: Accept error: accept tcp [::]:3128: accept4: too many open files; retrying in 5ms
2016/06/22 06:02:33 http: Accept error: accept tcp [::]:3128: accept4: too many open files; retrying in 10ms
2016/06/22 06:02:33 http: Accept error: accept tcp [::]:3128: accept4: too many open files; retrying in 20ms
2016/06/22 06:02:33 http: Accept error: accept tcp [::]:3128: accept4: too many open files; retrying in 40ms
2016/06/22 06:02:33 http: Accept error: accept tcp [::]:3128: accept4: too many open files; retrying in 80ms
2016/06/22 06:02:33 http: Accept error: accept tcp [::]:3128: accept4: too many open files; retrying in 160ms
2016/06/22 06:02:33 http: Accept error: accept tcp [::]:3128: accept4: too many open files; retrying in 320ms
INFO[2016-06-22T06:02:34-04:00] RPSControl[host:*]: Waiting for 1000ms.
INFO[2016-06-22T06:02:34-04:00] Main: Forwarding request to http://stat.pc.music.qq.com/fcgi-bin/qm_reportlstmus.fcg?pcachetime=1466589548

Would it make sense to run squid in front of cuttle?

mrkschan commented 8 years ago

Oh, you're right. Though, I don't know how to run Squid in front of Cuttle.

andresdouglas commented 8 years ago

Seems like cache_peer should do the trick. Will try it in the morning http://www.christianschenk.org/blog/using-a-parent-proxy-with-squid/

I think I'll need to combine the above with a firewall that allows incoming connections on :3128, and then move cuttle to listen on :3129, which is blocked by the firewall; otherwise port scanners will still discover cuttle. Am I correct in thinking this, or is there a better way of making cuttle only listen for internal connections?

mrkschan commented 8 years ago

You can have cuttle listen on 127.0.0.1:3129 (accepting requests from 127.0.0.1 only).

andresdouglas commented 8 years ago

I think I almost got it...

I've gotten squid running and, I think, forwarding requests to cuttle. The config file has this added to it:

debug_options ALL,9
acl whitelist dstdomain "/etc/squid3/whitelist.txt"

http_access allow whitelist
cache_peer 127.0.0.1 parent 3129 0 default

# And finally deny all other access to this proxy
http_access deny all

And although the requests get fulfilled, I'm not sure cuttle is getting them, so they may just be "allowed" and not forwarded. Finally, I'll have to get the SSL cacert to be used by squid instead of cuttle, as that will be the part communicating with the client, correct?

mrkschan commented 8 years ago

I'm not sure. Since Squid should not terminate SSL, your API client should be receiving the SSL cert from Cuttle.

mrkschan commented 8 years ago

FYI, @andresdouglas, I just banged something together in https://github.com/mrkschan/cuttle/pull/17 and it may resolve your issue. I haven't spent time testing it yet, though.