misalcedo / tortuga

A CGI and WCGI server for HTTP/1.1
Apache License 2.0

Pivot to a WASM-based CGI server. #114

Closed: misalcedo closed this issue 8 months ago

misalcedo commented 8 months ago

The CGI protocol is a great way to map HTTP requests to system processes (environment variables, stdin, stdout, stderr and arguments). Using the same protocol we can map HTTP requests to WASI invocations.
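A minimal sketch of that mapping, using only the Rust standard library. The `Request` struct, the function names, and the chosen subset of meta-variables are assumptions made for illustration, not tortuga's actual code; the variable names themselves come from RFC 3875.

```rust
use std::collections::BTreeMap;
use std::process::{Command, Stdio};

/// Hypothetical, simplified view of an HTTP request.
pub struct Request<'a> {
    pub method: &'a str,
    pub query: &'a str,
    pub content_type: Option<&'a str>,
    pub body: &'a [u8],
}

/// Build a subset of the RFC 3875 meta-variables for a request.
pub fn meta_variables(
    request: &Request<'_>,
    script_name: &str,
    server_name: &str,
) -> BTreeMap<String, String> {
    let mut vars: BTreeMap<String, String> = [
        ("GATEWAY_INTERFACE", "CGI/1.1"),
        ("SERVER_PROTOCOL", "HTTP/1.1"),
        ("REQUEST_METHOD", request.method),
        ("QUERY_STRING", request.query),
        ("SCRIPT_NAME", script_name),
        ("SERVER_NAME", server_name),
    ]
    .iter()
    .map(|(name, value)| (name.to_string(), value.to_string()))
    .collect();

    // CONTENT_LENGTH and CONTENT_TYPE are only set when a body is present.
    if !request.body.is_empty() {
        vars.insert("CONTENT_LENGTH".to_string(), request.body.len().to_string());
    }
    if let Some(content_type) = request.content_type {
        vars.insert("CONTENT_TYPE".to_string(), content_type.to_string());
    }

    vars
}

/// Prepare (but do not spawn) the child process for a script.
/// The request body would be written to stdin and the response read from stdout.
pub fn prepare(script: &str, vars: &BTreeMap<String, String>) -> Command {
    let mut command = Command::new(script);
    command
        .env_clear()
        .envs(vars)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .stderr(Stdio::inherit());
    command
}
```

A WASI invocation can reuse the same map: the environment, arguments, and standard streams are handed to the WebAssembly runtime instead of to `Command`.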

### Tasks
- [x] Implement script selection instead of supporting just a single script. In doing so, implement PATH_INFO and PATH_TRANSLATED
- [x] Support passing a server name in some manner. May be able to repurpose interface for this
- [x] Expose a script URI, script name and server name
- [x] Parse the URI query string as command-line arguments
- [x] Ensure hyper discards response bodies on HEAD
- [x] Ensure the script cannot cause the server to return a malformed HTTP response
- [x] Add routing for static assets and CGI scripts
- [x] Validate script behavior for CGI response by adding support for document, client redirect and client redirect with document responses
- [x] Add support for local redirects (without forwarding the body, and with a limit on the number of loop iterations).
- [x] Add support for WASI CGI scripts
- [ ] Recommendations for servers and example scripts
- [ ] Security considerations
- [ ] Add support for content encodings and prevent the script from changing headers necessary for communicating correctly with the client
- [ ] Add support for HTTPS scheme and expose the scheme to the script
- [ ] Non-Parsed Header script support
- [ ] Validate and improve URI encoding support and query param parsing
- [ ] Allow running CGI scripts from anywhere in the document root directory.
- [ ] Handle the PATH and PWD variables correctly for WCGI
- [ ] Handle non-UTF-8 header values

Documentation

misalcedo commented 8 months ago

I already have a basic CGI server. Clearing the task list and starting a new one.

misalcedo commented 8 months ago

I finished implementing support for loading any CGI script from the CGI bin directory. I also added a document root, which is used to resolve relative CGI bin paths, as the working directory for scripts, and for path translation.

The assert.cgi script now validates that all non-protocol meta-variables are passed correctly.
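As a rough illustration of script selection and path translation, the sketch below splits a request path into a script name and extra path, then resolves the extra path against the document root. The `/cgi-bin/` prefix, the `Selection` type, and the rule that scripts are direct children of the prefix are assumptions for the example, not necessarily how the server lays things out.

```rust
use std::path::{Path, PathBuf};

/// Result of matching a request path against the CGI bin prefix (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct Selection {
    pub script_name: String,
    pub path_info: String,
    pub path_translated: Option<PathBuf>,
}

/// Split `uri_path` into SCRIPT_NAME and PATH_INFO, and derive PATH_TRANSLATED
/// by resolving the extra path against the document root.
pub fn select_script(uri_path: &str, cgi_prefix: &str, document_root: &Path) -> Option<Selection> {
    let rest = uri_path.strip_prefix(cgi_prefix)?;

    // Everything up to the next '/' names the script; the remainder is the extra path.
    let (script, path_info) = match rest.find('/') {
        Some(index) => rest.split_at(index),
        None => (rest, ""),
    };

    if script.is_empty() {
        return None;
    }

    // Refuse extra paths that try to escape the document root.
    if path_info.split('/').any(|segment| segment == "..") {
        return None;
    }

    // PATH_TRANSLATED is only defined when there is an extra path.
    let path_translated = if path_info.is_empty() {
        None
    } else {
        Some(document_root.join(path_info.trim_start_matches('/')))
    };

    Some(Selection {
        script_name: format!("{}{}", cgi_prefix, script),
        path_info: path_info.to_string(),
        path_translated,
    })
}
```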

misalcedo commented 8 months ago

I now pass the server name from the command line (resolved on server start) to the script.

misalcedo commented 8 months ago

The script name and server name are both passed to the script. However, script URI also requires knowing the scheme.

misalcedo commented 8 months ago

The plan is to have the server context maintain a scheme variable. The scheme will just be "http" for now and will support "https" once I add TLS support.

misalcedo commented 8 months ago

The script now receives a valid script URI, which assert.cgi verifies.
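For reference, RFC 3875 composes the script URI from the scheme, server name, port, script name, extra path, and query string. A minimal sketch follows; the function name and the choice to omit default ports are assumptions for illustration.

```rust
/// Assemble a script URI in the shape described by RFC 3875, section 3.3:
/// scheme "://" server-name ":" port script-path extra-path "?" query.
pub fn script_uri(
    scheme: &str,
    server_name: &str,
    port: u16,
    script_name: &str,
    path_info: &str,
    query: &str,
) -> String {
    // The port is omitted when it is the default for the scheme.
    let default_port = match scheme {
        "http" => Some(80),
        "https" => Some(443),
        _ => None,
    };

    let mut uri = format!("{}://{}", scheme, server_name);
    if default_port != Some(port) {
        uri.push_str(&format!(":{}", port));
    }
    uri.push_str(script_name);
    uri.push_str(path_info);
    if !query.is_empty() {
        uri.push('?');
        uri.push_str(query);
    }
    uri
}
```

Keeping the scheme as a parameter means the same function covers "https" once TLS is available.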

misalcedo commented 8 months ago

The query string is now parsed as command-line arguments.
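RFC 3875 (section 4.4) only treats the query string as command-line arguments when it contains no unencoded `=`; the string is then split on `+` and each word is percent-decoded. The sketch below follows that rule with a hand-written decoder. The function names are hypothetical, and returning no arguments for malformed input is an assumption of the example.

```rust
/// Parse a query string into command-line arguments, per RFC 3875, section 4.4.
/// Returns no arguments if the query contains '=' or cannot be decoded.
pub fn command_line_arguments(query: &str) -> Vec<String> {
    if query.is_empty() || query.contains('=') {
        return Vec::new();
    }

    query
        .split('+')
        .map(percent_decode)
        .collect::<Option<Vec<String>>>()
        .unwrap_or_default()
}

/// Decode %XX escapes; returns None for truncated or non-hex escapes or invalid UTF-8.
fn percent_decode(word: &str) -> Option<String> {
    let bytes = word.as_bytes();
    let mut decoded = Vec::with_capacity(bytes.len());
    let mut index = 0;

    while index < bytes.len() {
        if bytes[index] == b'%' {
            let hex = bytes.get(index + 1..index + 3)?;
            if !hex.iter().all(|byte| byte.is_ascii_hexdigit()) {
                return None;
            }
            let hex = std::str::from_utf8(hex).ok()?;
            decoded.push(u8::from_str_radix(hex, 16).ok()?);
            index += 3;
        } else {
            decoded.push(bytes[index]);
            index += 1;
        }
    }

    String::from_utf8(decoded).ok()
}
```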

misalcedo commented 8 months ago

Hyper does not discard response bodies on HEAD requests, so I created a router that implements Service (instead of the CGI client doing so). The router drops the body on HEAD requests and maps I/O errors to HTTP status codes.

I tried passing a null stdout to the child process, but then the script can't set any headers, which is the entire point of HEAD requests.
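The router behavior described above can be sketched without hyper as follows. The `Response` struct, the function names, and the particular error-to-status mapping are assumptions for illustration; the real router implements hyper's Service trait.

```rust
use std::io;

/// Hypothetical, simplified HTTP response.
pub struct Response {
    pub status: u16,
    pub headers: Vec<(String, String)>,
    pub body: Vec<u8>,
}

/// Map an I/O error to an HTTP status code (one plausible mapping).
pub fn status_for(error: &io::Error) -> u16 {
    match error.kind() {
        io::ErrorKind::NotFound => 404,
        io::ErrorKind::PermissionDenied => 403,
        _ => 500,
    }
}

/// Turn the outcome of a handler into a response. The script still runs and
/// sets headers on HEAD; only the body is dropped afterwards.
pub fn route(method: &str, result: io::Result<Response>) -> Response {
    let mut response = result.unwrap_or_else(|error| Response {
        status: status_for(&error),
        headers: Vec::new(),
        body: Vec::new(),
    });

    if method == "HEAD" {
        response.body.clear();
    }

    response
}
```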

misalcedo commented 8 months ago

I implemented a static asset server that loads each file fully into memory before returning its contents. The server refuses to serve files with the cgi extension statically (i.e., they must be loaded dynamically).
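A minimal sketch of that check, assuming the extension is compared case-insensitively and that a refusal is reported as a permission error (both are assumptions, as is the function name):

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Read a static asset fully into memory, refusing files with the cgi extension.
pub fn load_static_asset(path: &Path) -> io::Result<Vec<u8>> {
    let is_script = path
        .extension()
        .and_then(|extension| extension.to_str())
        .map_or(false, |extension| extension.eq_ignore_ascii_case("cgi"));

    if is_script {
        return Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            "CGI scripts must be executed, not served as static files",
        ));
    }

    // The whole file is buffered; large assets would need streaming instead.
    fs::read(path)
}
```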

misalcedo commented 8 months ago

I added tests for all 4 types of responses, including local redirection. The local redirection test currently fails since I just return an error response saying "unsupported".

misalcedo commented 8 months ago

Added logic to ensure redirects have empty bodies.
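RFC 3875 (section 6.2) defines four response types: document, local redirect, client redirect, and client redirect with document. The sketch below classifies a parsed script response and rejects redirects that carry a body they should not have. The `ResponseKind` enum, the function names, and the error messages are hypothetical, and the validation is simplified (it does not check the Status header, for example).

```rust
#[derive(Debug, PartialEq)]
pub enum ResponseKind {
    Document,
    LocalRedirect,
    ClientRedirect,
    ClientRedirectWithDocument,
}

/// Case-insensitive header lookup.
fn header<'a>(headers: &[(&str, &'a str)], name: &str) -> Option<&'a str> {
    headers
        .iter()
        .find(|(key, _)| key.eq_ignore_ascii_case(name))
        .map(|(_, value)| *value)
}

/// Classify a CGI response from its headers and body.
pub fn classify(headers: &[(&str, &str)], body: &[u8]) -> Result<ResponseKind, &'static str> {
    match (header(headers, "Location"), header(headers, "Content-Type")) {
        // A Location that is a local path asks the server to re-run the request.
        (Some(location), _) if location.starts_with('/') => {
            if headers.len() == 1 && body.is_empty() {
                Ok(ResponseKind::LocalRedirect)
            } else {
                Err("a local redirect must contain only a Location header")
            }
        }
        (Some(_), None) => {
            if body.is_empty() {
                Ok(ResponseKind::ClientRedirect)
            } else {
                Err("a redirect without a Content-Type must have an empty body")
            }
        }
        (Some(_), Some(_)) => Ok(ResponseKind::ClientRedirectWithDocument),
        (None, Some(_)) => Ok(ResponseKind::Document),
        (None, None) => Err("response has neither Location nor Content-Type"),
    }
}
```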

misalcedo commented 8 months ago

I added support for local redirects by moving the body collection logic to the router. The CGI script is then invoked with a body of type Bytes, which is cheap to clone and so allows the request body to be replayed. Finally, I updated the path and query of the request's URI from the Location header, and both local redirect tests now pass.
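The redirect loop can be sketched as follows, with `Arc<[u8]>` standing in for Bytes (both clone by bumping a reference count rather than copying the data). The `Outcome` enum, the `invoke` callback, and the explicit limit parameter are assumptions for the example.

```rust
use std::sync::Arc;

/// What a script invocation produced (hypothetical type).
pub enum Outcome {
    Response(Vec<u8>),
    LocalRedirect(String),
}

/// Invoke the script, following local redirects up to `limit` times.
/// The collected body is cloned cheaply and replayed on every invocation.
pub fn follow_local_redirects<F>(
    path: &str,
    body: Arc<[u8]>,
    limit: usize,
    mut invoke: F,
) -> Result<Vec<u8>, String>
where
    F: FnMut(&str, Arc<[u8]>) -> Outcome,
{
    let mut target = path.to_string();

    // One initial invocation plus at most `limit` redirected ones.
    for _ in 0..=limit {
        match invoke(&target, Arc::clone(&body)) {
            Outcome::Response(response) => return Ok(response),
            // The new target comes from the script's Location header.
            Outcome::LocalRedirect(location) => target = location,
        }
    }

    Err(format!("more than {} local redirects", limit))
}
```

Bounding the loop keeps a script that redirects to itself from tying up the server indefinitely.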

misalcedo commented 8 months ago

Closing in favor of tracking improvements in another issue.