mlsecproject / gglsbl-rest

Dockerized REST service to look up URLs in Google Safe Browsing v4 API
Apache License 2.0

ECS Configuration Settings #17

Closed summera closed 6 years ago

summera commented 6 years ago

@asieira would you mind explaining the configuration settings you've used to run gglsbl-rest on ECS? For example:

Any other information you think would be helpful to get started such as autoscaling settings would be great. ECS is pretty new to me.

Thanks in advance!

asieira commented 6 years ago

Right now I'm using EC2 with m4.large machines, each with an 8 GB boot volume and a 22 GB Docker storage volume. This is a larger cluster not used exclusively for gglsbl-rest, so even though I don't typically get multiple gglsbl-rest containers on the same host, the only real limitation on that is disk space. I'm setting the memory reservation to 2 GB and the hard limit to 4 GB.

That setup has given me zero problems handling large request volumes with two containers and very low response times:

image

image

Since Fargate allows up to 10 GB of Docker layer storage (as per https://docs.aws.amazon.com/AmazonECS/latest/developerguide/AWS_Fargate.html#fargate-task-defs), it should be perfectly possible to run gglsbl-rest there as well. I would go for at least two CPUs and 4 GB of RAM (so that most of the database file is cached in RAM).

If you try that out, let me know how it goes.
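For anyone new to ECS, the sizing described above can be sketched as task definition fragments. This is only an illustration under stated assumptions, not the configuration actually in use: the image name `mlsecproject/gglsbl-rest:latest` and the container name are assumed, and everything else a real task definition needs (port mappings, environment variables, logging, IAM roles) is left out. `memoryReservation`, `memory`, `cpu`, `requiresCompatibilities` and `networkMode` are standard ECS task definition fields; memory values are in MiB.

```python
import json


def ec2_container_definition(image="mlsecproject/gglsbl-rest:latest"):
    """Container definition for the EC2 launch type.

    The image name is an assumption for illustration only.
    """
    return {
        "name": "gglsbl-rest",        # hypothetical container name
        "image": image,
        "essential": True,
        "memoryReservation": 2048,    # soft limit: 2 GB, expressed in MiB
        "memory": 4096,               # hard limit: 4 GB, expressed in MiB
    }


def fargate_task_settings():
    """Task-level sizing for Fargate: 2 vCPUs and 4 GB of RAM."""
    return {
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",      # Fargate tasks use awsvpc networking
        "cpu": "2048",                # 2 vCPUs, in CPU units
        "memory": "4096",             # 4 GB, in MiB
    }


if __name__ == "__main__":
    print(json.dumps(ec2_container_definition(), indent=2))
    print(json.dumps(fargate_task_settings(), indent=2))
```

The printed JSON can be merged by hand into a full task definition. On Fargate, CPU and memory are set at the task level, which is why they appear in a separate fragment.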

asieira commented 6 years ago

By the way, @afilipovich tells me that performance can be further improved by loading the database file into a RAM disk. I haven't tested it myself. So you might want to explore using the tmpfs Linux parameter for /home/gglsbl/db.

Again, if you get this to work please let me know how it works.

afilipovich commented 6 years ago

I can confirm that placing the SQLite file on tmpfs can make it several times faster than on a fast SSD. Compared to an HDD, the difference is an order of magnitude.
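The tmpfs suggestion maps to the `linuxParameters.tmpfs` field of an ECS container definition. A minimal sketch follows. The 4096 MiB size is an assumed value picked for illustration and should be chosen to fit the actual database file; whether the gglsbl user can write to the mount without extra mount options has not been verified here.

```python
import json


def tmpfs_linux_parameters(container_path="/home/gglsbl/db", size_mib=4096):
    """Build the linuxParameters fragment that mounts a tmpfs volume.

    size_mib is an assumed value, not a measured database size.
    """
    return {
        "tmpfs": [
            {
                "containerPath": container_path,  # path from the thread
                "size": size_mib,                 # tmpfs size in MiB
            }
        ]
    }


if __name__ == "__main__":
    # Goes under the "linuxParameters" key of the container definition.
    print(json.dumps({"linuxParameters": tmpfs_linux_parameters()}, indent=2))
```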

summera commented 6 years ago

@asieira @afilipovich Thanks a lot for the info! I messed around a bit with running it on a t2.medium but could only get one task running. A second wouldn't have enough resources on the same machine. Also tried with Fargate and 4GB of memory. You can't set tmpfs on Fargate and I'm not sure what it's using underneath.

Somewhat unrelated to this particular issue, but are you seeing accurate results from the API when running this in production? Are you catching most of the spam, or is a lot making it through? I'm attempting to protect a URL shortener and, as I mentioned in https://github.com/google/safebrowsing/issues/30#issuecomment-378805724 and https://github.com/google/safebrowsing/issues/30#issuecomment-378807286, the results I'm seeing from simple tests aren't great. I'm not sure whether this is because the results from the API are more up to date than https://transparencyreport.google.com or vice versa, but if it's not catching much it doesn't seem worth the cost.

asieira commented 6 years ago

Interesting, didn't realize tmpfs was not available on Fargate. @rfranco did you know about this?

I do know those results can be different. What I can tell you is that I haven't noticed any problem when using the API. I do think the Transparency Report page uses more than just the Google Safe Browsing API data, though. Maybe @afilipovich has more info on that.

afilipovich commented 6 years ago

@summera, could you please provide a few URLs that show different results with gglsbl and https://transparencyreport.google.com ?

summera commented 6 years ago

@afilipovich no problem. The ones you see in the screenshots in https://github.com/google/safebrowsing/issues/30#issuecomment-378805724 and https://github.com/google/safebrowsing/issues/30#issuecomment-378807286 are two examples. I don't want to paste the URLs directly, as I've seen that send the GitHub email notification to spam before.

asieira commented 6 years ago

@summera I suggest that you open this as an issue directly in https://github.com/afilipovich/gglsbl, and I will close this one, ok? Hope the ECS guidance helps you run gglsbl-rest.

summera commented 6 years ago

@asieira yep, thanks for the help and info!