nodeSolidServer / node-solid-server

Solid server on top of the file-system in NodeJS
https://solidproject.org/for-developers/pod-server

Solid server ignores content type #925

Open JornWildt opened 6 years ago

JornWildt commented 6 years ago

I have tried uploading various images to my POD, and at some point I named one "x-jpg" instead of "x.jpg". It turns out that the Solid server uses the filename extension to detect the content type instead of relying on the Content-Type header sent with the request.
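To illustrate why the two names behave differently, here is a minimal sketch of extension-based detection. It is not the server's actual code: the table, the `typeFromFilename` helper, and the Turtle default are assumptions chosen to mirror the behaviour reported in this issue.

```javascript
// Hypothetical extension table; the real server relies on a MIME database instead.
const TYPES_BY_EXTENSION = new Map([
  ['jpg', 'image/jpeg'],
  ['jpeg', 'image/jpeg'],
  ['png', 'image/png'],
  ['ttl', 'text/turtle'],
]);

// Assumed default for unrecognized names, matching what this issue observes.
const DEFAULT_TYPE = 'text/turtle';

// Guess a content type from whatever follows the last dot in the name.
function typeFromFilename(filename) {
  const dot = filename.lastIndexOf('.');
  const extension = dot === -1 ? '' : filename.slice(dot + 1).toLowerCase();
  return TYPES_BY_EXTENSION.get(extension) || DEFAULT_TYPE;
}

console.log(typeFromFilename('x.jpg')); // image/jpeg
console.log(typeFromFilename('x-jpg')); // text/turtle
```

With a name like the one in this report, the text after the last dot is "20-1541949270-jpg", which matches no known extension, so the lookup falls through to the default.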

Here is an example PUT operation showing the name "...270-jpg" and the content-type "image/jpeg":

PUT https://elfisk.solid.community/public/solidrc/images/Billede-30-10-2018-20.01.20-1541949270-jpg HTTP/1.1
Host: elfisk.solid.community
Connection: keep-alive
Content-Length: 257876
authorization: Bearer XXX
Origin: https://localhost:5001
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36
content-type: image/jpeg
Accept: */*
Referer: https://localhost:5001/Home/Models
Accept-Encoding: gzip, deflate, br
Accept-Language: da-DK,da;q=0.9,en-US;q=0.8,en;q=0.7,sv;q=0.6,nb;q=0.5
Cookie: connect.sid=XXX

Later on I do a GET on https://elfisk.solid.community/public/solidrc/images/ which returns the document below. You can see that the ".jpg" file has type "jpeg:Resource" whereas the "-jpg" file does not.

@prefix : <#>.
@prefix im: <>.
@prefix ldp: <http://www.w3.org/ns/ldp#>.
@prefix terms: <http://purl.org/dc/terms/>.
@prefix XML: <http://www.w3.org/2001/XMLSchema#>.
@prefix st: <http://www.w3.org/ns/posix/stat#>.
@prefix jpeg: <http://www.w3.org/ns/iana/media-types/image/jpeg#>.

im:
    a ldp:BasicContainer, ldp:Container;
    terms:modified "2018-11-11T15:14:19Z"^^XML:dateTime;
    ldp:contains
        <Billede-30-10-2018-19.52.49-1541949020.jpg>,
        <Billede-30-10-2018-20.01.20-1541949270-jpg>;
    st:mtime 1541949259.98;
    st:size 4096.
<Billede-30-10-2018-19.52.49-1541949020.jpg>
    a jpeg:Resource, ldp:Resource;
    terms:modified "2018-11-11T15:10:11Z"^^XML:dateTime;
    st:mtime 1541949011.912;
    st:size 249480.
<Billede-30-10-2018-20.01.20-1541949270-jpg>
    a ldp:Resource;
    terms:modified "2018-11-11T15:14:22Z"^^XML:dateTime;
    st:mtime 1541949262.852;
    st:size 257876.

Later on the data browser goes berserk when you try to delete the "*-jpg" image, because it is served as a Turtle document:

GET https://elfisk.solid.community/public/solidrc/images/Billede-30-10-2018-20.01.20-1541949270-jpg HTTP/1.1

HTTP/1.1 200 OK
X-Powered-By: solid-server
Link: <Billede-30-10-2018-20.01.20-1541949270-jpg.acl>; rel="acl", <Billede-30-10-2018-20.01.20-1541949270-jpg.meta>; rel="describedBy", <http://www.w3.org/ns/ldp#Resource>; rel="type"
Content-Type: text/turtle

JornWildt commented 6 years ago

Seems like it could be related to this https://github.com/solid/node-solid-server/issues/413

Ryuno-Ki commented 6 years ago

According to the linked issue, you're using mime-db via the mime package. I suggest not relying on the file extension, but on the first bytes (a.k.a. magic bytes) to determine the Content-Type.

Otherwise people could upload malware with an innocent-looking file extension...
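As a rough sketch of what magic-byte detection could look like, the snippet below compares the first bytes of an upload against a few well-known signatures. The `SIGNATURES` table and the `sniffContentType` helper are hypothetical names for illustration and are not part of node-solid-server; a real implementation would need a far larger table.

```javascript
// Leading-byte signatures for a few common formats.
const SIGNATURES = [
  { type: 'image/jpeg', bytes: [0xff, 0xd8, 0xff] },
  { type: 'image/png', bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { type: 'image/gif', bytes: [0x47, 0x49, 0x46, 0x38] },
  { type: 'application/pdf', bytes: [0x25, 0x50, 0x44, 0x46] },
];

// Return the media type whose signature prefixes `data` (a Buffer or
// Uint8Array), or null when no signature matches.
function sniffContentType(data) {
  for (const { type, bytes } of SIGNATURES) {
    if (data.length >= bytes.length && bytes.every((b, i) => data[i] === b)) {
      return type;
    }
  }
  return null;
}

// A JPEG body is recognized regardless of how the file is named.
console.log(sniffContentType(Uint8Array.of(0xff, 0xd8, 0xff, 0xe0))); // image/jpeg
```

Text formats such as Turtle have no fixed signature, so a check like this can only confirm or contradict a claimed binary type; it cannot replace the declared type.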

kjetilk commented 6 years ago

This, indeed, sounds like a bad idea. There is a TAG finding on it too: https://www.w3.org/2001/tag/doc/mime-respect.html#dav-scenario We need to figure out why this was done.
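One possible way to respect the declared type is sketched below: trust the Content-Type header of the PUT when it is present and well-formed, and only guess from the filename otherwise. This is an assumption about what a fix might look like, not the change planned for the server; `resolveContentType`, `FALLBACK_TYPES`, and the `application/octet-stream` default are hypothetical.

```javascript
// Hypothetical fallback table, consulted only when no usable header was sent.
const FALLBACK_TYPES = new Map([
  ['jpg', 'image/jpeg'],
  ['png', 'image/png'],
  ['ttl', 'text/turtle'],
]);

// Loose check for a "type/subtype" token pair.
const MEDIA_TYPE = /^[a-z0-9!#$&^_.+-]+\/[a-z0-9!#$&^_.+-]+$/;

// Prefer the type declared by the client; fall back to the extension,
// then to a generic binary type.
function resolveContentType(declared, filename) {
  if (typeof declared === 'string') {
    // Drop parameters such as "; charset=utf-8" and normalize case.
    const mediaType = declared.split(';')[0].trim().toLowerCase();
    if (MEDIA_TYPE.test(mediaType)) return mediaType;
  }
  const match = /\.([a-z0-9]+)$/i.exec(filename);
  const guessed = match ? FALLBACK_TYPES.get(match[1].toLowerCase()) : undefined;
  return guessed || 'application/octet-stream';
}

// The upload from this issue keeps its declared type despite the "-jpg" name.
console.log(resolveContentType('image/jpeg', 'Billede-30-10-2018-20.01.20-1541949270-jpg')); // image/jpeg
```

The resolved type would also have to be stored alongside the file so that a later GET can return it, which this sketch does not cover.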

kjetilk commented 6 years ago

@RubenVerborgh confirms, this will be fixed with #662

rimmartin commented 6 years ago

Yes, when the application/octet-stream content type is set it has no effect; the file seems to be treated as a Turtle file. The dependency https://github.com/jshttp/mime-types should know what it is. When read back from the pod, the file doesn't match the original.

Ryuno-Ki commented 6 years ago

> This, indeed, sounds like a bad idea. There is a TAG finding on it too: https://www.w3.org/2001/tag/doc/mime-respect.html#dav-scenario We need to figure out why this was done.

From what I read there, it deals with a mismatch between Content-Type and file extension. What I meant is closer to what the file command does.

If you want to learn more about magic bytes, read "Magic Bytes – Identifying Common File Formats at a Glance".