structured-data / linter

report: implementing SDL on AWS/Lambda #52

Open ankitadhandha opened 4 years ago

ankitadhandha commented 4 years ago

Structured Data Linter on AWS Lambda

See SDL on AWS/Lambda

Challenge 1:

Some Ruby gems include native extensions written in C. Building them requires compiling the C code into machine code specific to the platform and environment. How do we build these native extensions?

Solution 1:

gkellogg commented 4 years ago

Structured Data Linter on AWS Lambda

See SDL on AWS/Lambda

Challenge 1:

Some Ruby gems include native extensions written in C.

Solution 1:

  • We compiled the extensions in the same environment as the AWS/Lambda machine.
  • We used the lambci/lambda:build-ruby2.5 Docker image, which is the same environment as used by AWS.
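The build step described above can be sketched as a small Ruby helper that assembles the Docker invocation. Only the image name comes from this thread; the /var/task mount point, the bundler flags, and the helper name lambda_bundle_command are assumptions for illustration, not the reporter's actual commands.

```ruby
require "shellwords"

# Image name as given in the thread.
LAMBDA_BUILD_IMAGE = "lambci/lambda:build-ruby2.5"

# Build the argv for running `bundle install` inside the build image so that
# native extensions are compiled for the Lambda environment.
# Assumptions: the project is mounted at /var/task and gems are vendored
# into vendor/bundle.
def lambda_bundle_command(project_dir, image: LAMBDA_BUILD_IMAGE)
  [
    "docker", "run", "--rm",
    "-v", "#{File.expand_path(project_dir)}:/var/task",
    image,
    "bundle", "install", "--path", "vendor/bundle"
  ]
end

if __FILE__ == $PROGRAM_NAME
  cmd = lambda_bundle_command(Dir.pwd)
  puts Shellwords.join(cmd)
  # Uncomment to actually run the build (requires Docker):
  # system(*cmd) || abort("bundle install inside the build image failed")
end
```

Passing the command as an array to system avoids shell quoting problems when the project path contains spaces.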

Those extensions are going to principally include nokogiri and nokogumbo, which are used for processing RDFa, RDF/XML and Microdata. The gems will work without them (with some performance impact), and you may not need to support these serializations at all.

The hard requirement for nokogiri and nokogumbo comes from the linkeddata gem (via sinatra/linkeddata); you might consider a version of the linter that doesn't use that gem and instead pulls in just the specific gems you need.

Perhaps we could consider versions of rack-linkeddata and sinatra-linkeddata which don't have a hard require of "linkeddata", so that you could better control this and still get the Rack/Sinatra support. Perhaps the easiest approach would be to make the "require 'linkeddata'" line in rack-linkeddata conditional, either by allowing the gem load to fail or by checking an environment variable.
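A minimal sketch of what such a conditional require might look like, covering both variants (tolerating a failed load, and an environment variable opt-out). The helper name require_optional and the variable SDL_SKIP_LINKEDDATA are hypothetical; nothing like this exists in rack-linkeddata as far as this thread states.

```ruby
# Try to load a feature, but let the caller carry on without it.
# Returns true when the feature is available, false when it was skipped
# via the environment variable or could not be loaded.
def require_optional(feature, skip_env: nil)
  # Hypothetical opt-out: set the named variable to "1" to skip the load.
  return false if skip_env && ENV[skip_env] == "1"

  require feature
  true
rescue LoadError
  # LoadError is not a StandardError, so it must be rescued by name.
  false
end

# SDL_SKIP_LINKEDDATA is an assumed variable name, used only for illustration.
LINKEDDATA_AVAILABLE = require_optional("linkeddata", skip_env: "SDL_SKIP_LINKEDDATA")
```

Callers could then check LINKEDDATA_AVAILABLE and require only the specific reader and writer gems they need when it is false.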

Challenge 3:

AWS Lambda only supports an unzipped deployment size of 250 MB (see Lambda payload limits). Our deployment package is ≥ 335 MB. How do we fit the SDL deployment package into AWS/Lambda?

Solution 3:

  • We removed several items to conform to AWS/Lambda constraints.

See above.
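When deciding what to remove, it can help to see which directories account for most of the unzipped size. The following is a rough sketch, assuming the package is unpacked on local disk and that "250 MB" means 250 × 1024 × 1024 bytes; the helper names are hypothetical.

```ruby
require "find"

# Unzipped size limit cited in this thread (the byte interpretation is an assumption).
LAMBDA_UNZIPPED_LIMIT = 250 * 1024 * 1024

# Total size in bytes of all regular files under path (symlinks are not counted).
def dir_size(path)
  total = 0
  Find.find(path) do |entry|
    total += File.size(entry) if File.file?(entry) && !File.symlink?(entry)
  end
  total
end

# The largest immediate children of root as [name, bytes] pairs, biggest first.
def largest_entries(root, count = 10)
  Dir.children(root)
     .map { |name| [name, dir_size(File.join(root, name))] }
     .sort_by { |_, size| -size }
     .first(count)
end

# True when the unpacked package is within the limit.
def fits_lambda_limit?(root, limit = LAMBDA_UNZIPPED_LIMIT)
  dir_size(root) <= limit
end
```

Running largest_entries on the vendored gem directory would show which gems are the best candidates for removal.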

Challenge 4:

AWS/Lambda limits the request/response payload size (see Lambda payload limits).

Solution 4:

  • We used an AWS spot server and modified the SDL code to support .zip files in the "Linter By Upload" option.

A PR to add such support to SDL might ease integration.

Challenge 5:

How do we send a GET or POST request to the API Gateway endpoint and have it run successfully?

Solution 5:

  • We modified the API Gateway settings to add a POST method.
  • We modified the application.js link to point to a different location.
  • We modified the self link to point to the API Gateway endpoint.

A PR to add such support to SDL might ease integration.
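For context on how GET and POST requests can reach a Rack or Sinatra application behind API Gateway, here is a minimal sketch of one common pattern: translating a Lambda proxy integration event into a Rack env and calling the app. It is not the reporter's implementation. It assumes the REST API proxy event format (httpMethod, path, headers, queryStringParameters, body, isBase64Encoded), and the function names are hypothetical.

```ruby
require "stringio"
require "base64"
require "uri"

# Build a minimal Rack env from an API Gateway proxy event hash.
def rack_env_from_event(event)
  body = event["body"] || ""
  body = Base64.decode64(body) if event["isBase64Encoded"]
  headers = event["headers"] || {}

  env = {
    "REQUEST_METHOD"  => event["httpMethod"],
    "SCRIPT_NAME"     => "",
    "PATH_INFO"       => event["path"] || "/",
    "QUERY_STRING"    => URI.encode_www_form(event["queryStringParameters"] || {}),
    "SERVER_NAME"     => headers["Host"] || "localhost",
    "SERVER_PORT"     => headers["X-Forwarded-Port"] || "443",
    "CONTENT_LENGTH"  => body.bytesize.to_s,
    "rack.url_scheme" => headers["X-Forwarded-Proto"] || "https",
    "rack.input"      => StringIO.new(body),
    "rack.errors"     => $stderr
  }

  # Rack expects request headers as HTTP_* keys, except for these two.
  headers.each do |name, value|
    key = name.upcase.tr("-", "_")
    key = "HTTP_#{key}" unless %w[CONTENT_TYPE CONTENT_LENGTH].include?(key)
    env[key] = value
  end
  env
end

# Call any Rack-compatible app and convert its response into the hash
# shape that the proxy integration expects back.
def handle(event, app)
  status, headers, body = app.call(rack_env_from_event(event))
  text = +""
  body.each { |chunk| text << chunk }
  body.close if body.respond_to?(:close)
  { "statusCode" => status.to_i, "headers" => headers, "body" => text }
end
```

With an adapter along these lines, a POST method added in API Gateway is passed through to the application's own routing rather than being handled in the gateway configuration.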

Challenge 6:

You recently revised SDL. The new size is approximately 400 MB.

Solution 6 (pending):

We need your guidance on removing libraries, or on other compaction techniques, to reduce the new version to 250 MB.

See above.

Challenge 7:

How do we set up the API Gateway endpoint so that it can be invoked only from a specific website?

Solution 7 (pending):

  • Working on it.

Follow up

If you have private questions or suggestions, please contact me at ankitadhandha@gmail.com.

ankita