Use buffer instead of cloned stream

Current implementation of CreateMediaArticle will halt on file larger than a certain size.

It is halt because await file.clone().buffer() never resolves.

The root cause is that when we call await file.clone().buffer(), it reads the cloned stream and put its data in an internal buffer so that the original stream can still consume data afterwards. The read operation halts when the internal buffer is full.

Related issue:

Analysis

In the current implementation we tried to use stream so that we don't need to load the entire file in memory nor in file system.

However, image-hash package requires a buffer as its input. Even if we provide a file path to it, it will read the entire file into memory (then send to JPEG decoder) anyway. Therefore, it is inevitable that we put the entire file in memory.

Proposed fix

This PR downloads the entire file from mediaUrl into a buffer and then pass to image-hash.

We set a max limit for the downloaded file (5MB for now) so that the API server will not run out of memory when someone send us a super large file. If a file bigger than the limit is provided, CreateMediaArticle will fail.

cofacts / rumors-api

Use buffer instead of cloned stream #279

Analysis

Proposed fix