erlyaws / yaws

Yaws webserver
https://erlyaws.github.io
BSD 3-Clause "New" or "Revised" License

convenient handling of large POSTs/PUTs? #312

Open peffis opened 7 years ago

peffis commented 7 years ago

When Yaws receives a large POST or PUT, the body is delivered in pieces across several calls to the out handler in your appmod. To handle the request you end up implementing a state machine, something like:

out(#arg{state=undefined, clidata = Data} = A)
  when is_binary(Data) -> %% size of post < partial_post_size
    InitialState = init_state(),
    finalize_upload(process_data(Data, InitialState), A);

out(#arg{state=undefined, clidata = {partial, Data}})
  when is_binary(Data) -> %% first piece of chunked upload
    InitialState = init_state(),
    {get_more, undefined, process_data(Data, InitialState)};

out(#arg{state = State, clidata = {partial, Data}})
  when is_binary(Data) -> %% a piece in a chunked upload, neither first nor last
    {get_more, undefined, process_data(Data, State)};

out(#arg{state = State, clidata = Data} = A)
  when is_binary(Data) -> %% last piece of chunked upload
    finalize_upload(process_data(Data, State), A).

If you handle large POSTs and PUTs in several places in your code, you end up copying this state machine around with minor modifications to init_state, process_data and finalize_upload.

How are people in general handling this (not too worrying) code smell? When these four clause-groups occur too many times within the same project I usually end up making some parameterized function, like:

out(A) ->
  handle_large_body(A, fun init_state/2, fun process_data/4, fun finalize_upload/3).

This reduces the four clauses to one and allows some reuse of the state machine that handles large uploads.
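For illustration, here is a minimal sketch of what such a parameterized helper might look like. It is not taken from any real project: the callback arities (Init/0, Process/2, Finalize/2) are assumptions chosen to mirror the four clauses above rather than the /2, /4 and /3 arities mentioned in the call, and the local #arg{} record is a two-field stand-in so the module compiles without Yaws. In a real appmod you would presumably include yaws_api.hrl instead.

```erlang
-module(large_body_sketch).
-export([handle_large_body/4, demo/0]).

%% Stand-in for the #arg{} record from yaws_api.hrl, reduced to the two
%% fields used here (assumption: lets the sketch compile without Yaws).
-record(arg, {clidata, state}).

%% Hypothetical callback shapes (not from the original post):
%%   Init     :: fun(() -> State)
%%   Process  :: fun((Data :: binary(), State) -> State)
%%   Finalize :: fun((State, #arg{}) -> Reply)
handle_large_body(#arg{clidata = {partial, Data}} = A, Init, Process, _Finalize)
  when is_binary(Data) ->
    %% Not the last piece: fold it into the state and ask for more.
    {get_more, undefined, Process(Data, current_state(A, Init))};
handle_large_body(#arg{clidata = Data} = A, Init, Process, Finalize)
  when is_binary(Data) ->
    %% Whole body or last piece: fold it in and produce the reply.
    Finalize(Process(Data, current_state(A, Init)), A).

%% The first call has state = undefined; later calls carry the state
%% returned in the previous get_more tuple.
current_state(#arg{state = undefined}, Init) -> Init();
current_state(#arg{state = State}, _Init) -> State.

%% Simulates a two-piece upload that simply counts bytes.
demo() ->
    Init = fun() -> 0 end,
    Process = fun(Data, Total) -> Total + byte_size(Data) end,
    Finalize = fun(Total, _A) -> {html, integer_to_list(Total)} end,
    {get_more, undefined, S1} =
        handle_large_body(#arg{clidata = {partial, <<"abc">>}},
                          Init, Process, Finalize),
    handle_large_body(#arg{clidata = <<"de">>, state = S1},
                      Init, Process, Finalize).
```

Calling large_body_sketch:demo() feeds a 3-byte partial piece followed by a 2-byte final piece and returns {html, "5"}. As in the four-clause version, a state that is legitimately the atom undefined would be mistaken for the first call.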

Is this in general how people handle this code smell, or is there some other, more convenient handling of large POSTs and PUTs in the API that I have not noticed yet?

vinoski commented 7 years ago

You'd probably be better off asking this on the Yaws mailing list, since more users will see it there than here, but I don't think you're missing any convenience functions, and I don't see anything wrong with the approach you're taking to reuse your functions.