snapframework / snap-core

Core type definitions (Snap monad, HTTP types, etc) and utilities for web handlers.
http://snapframework.com/
BSD 3-Clause "New" or "Revised" License
317 stars 85 forks source link

[WIP] FileUploads improvement #265

Closed sopvop closed 7 years ago

sopvop commented 7 years ago

This is a wip branch for review and bikeshedding. do not merge. Implements ideas discussed in #235

I've implemented foldMultipart and handleMultiparts in terms of it.

foldMultipart ::
       (MonadSnap m) =>
       UploadPolicy        -- ^ global upload policy
    -> PartFold a          -- ^ part processor
    -> a                   -- ^ seed accumulator
    -> m a

All tests seem to pass, but I have additional tests I want to add, though now it properly checks the number of uploaded form inputs I think at some point tests were passing even though there were no checks in place.

sopvop commented 7 years ago

Number of form limits test seem to be working ok. Now on high level api. I propose something simpler than handleFileUploads

data FileUploadPolicy = FileUploadPolicy
    { maxFileSize :: Int64
    , maxNumFiles :: Int
    }

handleFormData :: MonadSnap m 
         => UploadPolicy 
         -> FileUploadPolicy 
         -> (PartInfo -> InputStream ByteString -> a)
         -> m [(Maybe ByteString, a)]

storeAsLazyByteString :: PartInfo -> InputStream ByteString -> LB.ByteString

withTemporaryStore :: FilePath -> (PartInfo -> InputStream ByteString) -> LB.ByteString

myUploadsHandler  = do
   withTemporaryStore "/var/tmp" $ \store -> do
       files <- handleFormData defaultUploadPolicy defaultFileUploadPolicy store
       processMyFiles files

mySmallFilesHandler = do
     let filePolicy = setMaximumFileSize 1024 defaultFileUploadPolicy
     filesaslbs <- handleFormData defaultUploadPolicy filePolicy storeAsLazyByteString

Another API thing which came up with files was people requesting that form data was not inserted into rqParams and instead was as a separate dict.

Alternatively we can instead change foldMultipart to not post process form inputs into rqParams and return them as list, like so

data FormParam = FormParam !ByteString !ByteString
foldMultipart ::
       (MonadSnap m) =>
       UploadPolicy        -- ^ global upload policy
    -> PartFold a          -- ^ part processor
    -> a                   -- ^ seed accumulator
    -> m ([FormParam], a)

But old handleUploads api would still store them as rqParams.

Thoughts?

mightybyte commented 7 years ago

This is looking nice. I think I would prefer returning the list explicitly to setting up a separate dict.

@gregorycollins?

sopvop commented 7 years ago

Implemented high level interface, which is used like this:

myFormHandler = withTemporaryStore "/var/tmp" "upload.tmp" $ \store -> do
    (params, files) <- handleFormUploads defaultUploadPolicy defaultFileUploadPolicy store
    processFiles files

smallFileUploads = do
    let filePlicy = setMaximumFileSize (512*1024) defaultFileUploadPolicy
    (params, files) <- handleFormUploads defaultUploadPolicy filePolicy storeAsLazyByteString
    processByteStrings files

High level API

type FormFile a = (Maybe ByteString, a)
type FormParam = (ByteString, ByteString)

handleFormUploads ::
       (MonadSnap m) =>
       UploadPolicy                   -- ^ general upload policy
    -> FileUploadPolicy               -- ^ Upload policy for files
    -> (PartInfo -> InputStream ByteString -> IO a)
                                      -- ^ A file storage function
    -> m ([FormParam], [FormFile a])

storage helpers

storeAsLazyByteString :: PartInfo -> InputStream ByteString -> IO LB.ByteString

withTemporaryStore ::
    MonadSnap m
    => FilePath -- ^ temporary directory
    -> String   -- ^ file name pattern
    -> ((PartInfo -> InputStream ByteString -> IO FilePath) -> m a)
      -- ^ Action taking store function
    -> m a

config

data FileUploadPolicy = FileUploadPolicy { ... }

setMaximumFileSize :: Int64 -> FileUploadPolicy -> FileUploadPolicy
setMaximumNumberOfFiles :: Int -> FileUploadPolicy -> FileUploadPolicy
setSkipFilesWithoutNames:: Int64 -> FileUploadPolicy -> FileUploadPolicy
setStoreFilesWithoutNames :: Int64 -> FileUploadPolicy -> FileUploadPolicy

Why "skipFilesWithoutNames" is needed

The setSkipFilesWithoutNames is useful to use with html5 forms. According to html5 spec, form inputs with type file if not set are encoded as files without name, with empty body and mime type of application/octet-stream. Default FileUploadPolicy allows empty files with zero length, it should fit this use-case. Note that older browsers may omit such fields from encoded form-data, so this behaviour is desireable.

defaultFileUploadPolicy

maxFileSize is 1mb maxNumberOfFiles is 10, but probably 1 would be a safer behaviour Files without names are skipped with maximum size 0 (zero).

Tests

I think everything is reasonable tested, but If you think of any corner cases I would gladly add tests for it.

I would like to get as much bike-shedding as possible before I do changes to module documentation.

mightybyte commented 7 years ago

I haven't had time to look at this in-depth, but I like what I'm seeing. Hopefully @gregorycollins will be able to take a look at it soon.

sopvop commented 7 years ago

Sorry for long delay. But at least I've figured out possible problems with api.

Reworked this api quite a bit. Folded all the empty name policy options into FileUploadPolicy. Now it is can be configured with:

setMaximumFileSize :: Int64 -> FileUploadPolicy -> FileUploadPolicy
setMaximumNumberOfFiles :: Int -> FileUploadPolicy -> FileUploadPolicy
setSkipFilesWithoutNames :: Bool -> FileUploadPolicy -> FileUploadPolicy
setMaximumSkippedFileSize :: Int64 -> FileUploadPolicy -> FileUploadPolicy

If setSkipFilesWithoutNames is False, then the setMaximumSkippedFileSize is ignored and setMaximumFileSize is used as for normal files.

handleFormUploads now returns a specialized pair

data FormFile a = FormFile
    { formFileName  :: !ByteString
         -- ^ Name of a field
    , formFileValue :: a
         -- ^ Result of storing file
    } deriving (Eq, Ord, Show)

I guess I should also update module docs.

gregorycollins commented 7 years ago

Great! We should plan to deprecate the old functions if the new ones offer a strict superset of the functionality.