php-http / message-factory

Abandoned: Httplug Factory interfaces for PSR-7 HTTP Message
http://php-http.org
MIT License
1.38k stars 10 forks source link

An abstraction over how to create MultipartStream #31

Closed Nyholm closed 8 years ago

Nyholm commented 8 years ago

The PSR-7 includes a StreamInterface but there is nothing about MultipartStreams. This issue discuss possible approaches to create an abstraction over how to create MultipartStreams. The possible solutions and alternatives has been discussed in length at #30 and on Slack and this is a summary.

Background

A HTTP message with a multipart stream looks like this. As you can see it is many bodies separeted with a boundary.

POST / HTTP1/1
User-Agent: curl/7.21.2 (x86_64-apple-darwin)
Host: localhost:8080
Accept: */*
Content-Length: 1143
Expect: 100-continue
Content-Type: multipart/form-data; boundary=----------------------------83ff53821b7c

------------------------------83ff53821b7c
Content-Disposition: form-data; name="img"; filename="a.png"
Content-Type: application/octet-stream

?PNG

IHD?wS??iCCPICC Profilex?T?kA?6n??Zk?x?"IY?hE?6?bk
Y?<ߡ)??????9Nyx?+=?Y"|@5-?M?S?%?@?H8??qR>?׋??inf???O?????b??N?????~N??>?!?
??V?J?p?8?da?sZHO?Ln?}&???wVQ?y?g????E??0
 ??
   IDAc????????-IEND?B`?
------------------------------83ff53821b7c
Content-Disposition: form-data; name="foo"

bar
------------------------------83ff53821b7c--

The boundary

The boundary is what separates two streams in the request. It is usually just a random list of chars but in some special use case you may want to define your custom boundary. I guess a use case for a custom boundary is when the POST data is similar to the randomized boundary, so you want to be 100% sure that the boundary is never found within the POST body by providing a boundary yourself.

This is how you use the boundary when sending a request:

$multipartStream = getMultipartStream('somehow..');
$boundary = $multipartStream->getBoundary();

$request = MessageFactoryDiscovery::find()->createRequest(
  'POST',
  'http://example.com',
  ['Content-Type' => 'multipart/form-data; boundary='.$boundary],
  $multipartStream
);
$response = HttpClientDiscovery::find()->sendRequest($request);

Implementations

The only implementation of a MultipartStream is by Guzzle. Zend Diactoros has no support but we could create a MultipartStreamBuilder that are using Guzzle or Diactoros. See POC: https://github.com/Nyholm/MultipartStreamBuilder

A builder does the same thing as a factory but with a simpler API. Since we have factories for Streams, Uri, Request, Responses etc we should have e factory for MultipartStreams as well. The builder could, however, be an implementation detail in the DiactorosMultipartStreamFactory.

The problems

We (HTTPlug) want to create a MultipartStreamFactory that developers can rely upon to create a MultipartStream (Just as we do with the StreamFactory, UriFactory, RequestFactory etc..). Since there is no MultipartStreamInterface in PSR7 there is no clear solution and a lot of questions.

1) Parameters

What parameters should be used when calling MultipartStreamFactory::createMultipartStream? We ​need​ "formname", "resource", "headers", "filename" and possibly other “options”. Guzzle is using a multidimensional array. It is quite confusing but it does make sense.

    /**
     * @param array  $elements Array of associative arrays, each containing a
     *                         required "name" key mapping to the form field,
     *                         name, a required "contents" key mapping to a
     *                         StreamInterface/resource/string, an optional
     *                         "headers" associative array of custom headers,
     *                         and an optional "filename" key mapping to a
     *                         string to send as the filename in the part.
     * @param string $boundary You can optionally provide a specific boundary
     *
     * @throws \InvalidArgumentException
     */
    public function __construct(array $elements = [], $boundary = null)

What we also have discussed is:

1.a createMultipartStream(array $data)
1.b createMultipartStream(array $data, &$boundary = null)
2. createMultipartStream(...$part)
3. createMultipartStream(array $names, array $resources, array $headers, array $options)
4. createMultipartStream(AValueObject[] $data)
5. createMultipartStream(array &$options, ...$part)

Using an undefined number of parameters makes it difficult to call..: Since PHP 5.6 you can use ...$var to unpack variables. Using variadics (option 2 & 5) may look like this:

$files = [StreamInterfaces]
foreach ($files as $file) {
  // build a nice formatted $parts
}

$stream $factory->getMultipartStream($options, ...$parts);

_I believe the best approach is doing something like Guzzle_

2) Return value

The MultipartStream is just a special kind of StreamInterface. To use a MultipartStream we need the boundary. We cannot have a MultipartStreamFactory::getBoundary because factories should be stateless. We could either use a output parameter or return a value object that holds a StreamInterface and the boundary.

class MultipartStream
{
    private $stream;
    private $boundary;

    public function __construct(StreamInterface $stream, $boundary)
    {
        $this->stream = $stream;
        $this->boundary = $boundary;
    }

    public function getStream()
    {
        return $this->stream;
    }

    public function getBoundary()
    {
        return $this->boundary;
    }
}

_I believe the best approach is using a value object_

3) What resources to support?

The StreamFactory converts strings, StreamInterfaces and file resources to StreamInterfaces. Should the MultipartStreamFactory do the same? Should we only allow StreamInterfaces?

We have also discussed allowing paths to files. That would however lead to a lot more complex error handling and more potential exceptions.

_I believe the best approach is using the same resources as the StreamFactory_

4) Should MultipartStreamFactory extend StreamFactory

The Interface Segregation Principle says that no client should be forced to depend on methods it does not use. So the question we should ask ourselves:

Will a client that creates multipart streams also create normal streams?

One could also argue about DX. An implementation of MultipartStreamFactory will most likely be a StreamFactory as well. If a client needs both a MultipartStreamFactory and StreamFactory it would be weird to be forced to declare with two constrictor arguments for the same object.

public function __construct(StreamFactory $stream, MultipartStreamFactory $multipartStream)

_I believe the best approach is to let the MultipartStreamFactory extend the StreamFactory_

What to consider?

Does my/our suggestions make sense? Anything else we should consider? Would it be easy to use a MultipartStreamFactory like this?

sagikazarmark commented 8 years ago

Actually I am not entirely convinced that we need an abstraction for multipart streams. Cannot we have a custom stream implementation which supports any kind of underlying Stream implementations? So we have a MultipartStream (which is probably just a simple representation of multiple streams?), and a MutltipartStreamFactory which uses the already existing StreamFactories. At least the propositions above seem to be very similar to me, but it also tries to add an abstraction layer. Do we really need that? Can't we just have an implementation in the message package?

I have to admit I haven't followed this conversation closely, so I might not be right, also you might have discussed this already, but this is what I can see ATM.

I will take a closer look at the proposal and the discussions.

Nyholm commented 8 years ago

Im closing this for now. See https://github.com/php-http/message-factory/pull/30#issuecomment-233652431