ipfs / kubo

An IPFS implementation in Go
https://docs.ipfs.tech/how-to/command-line-quick-start/
Other
16.18k stars 3.01k forks source link

WebUI should (somehow) indicate a problematic directory entry #4292

Open mib-kd743naq opened 7 years ago

mib-kd743naq commented 7 years ago

Version information:

Any

Type:

Enhancement ( arguably Bug )

Severity:

Medium

Description:

When faced with a duplicate name in a directory, the UI does not signal in any way that something is amiss. There should be some sort of indication ( a red exclamation mark or something ) to visually alert the user "hey, watch out". I am not able to build FUSE locally, so I do not know how things behave there: likely needs to be investigated as well.

Here is an example DAG: https://ipfs.io/ipfs/zdjA8bKXBUttGJ7aXtG3DeuHYtAVXNj1jA5sKuscTJ1nk1xL9

This issue is tangentially related to https://github.com/ipfs/go-ipfs/issues/1710. There is also a somewhat explicit policy in the IPLD spec, but if the UI doesn't show anything about it - this is still a problem.

P.S. In case this is deemed a security issue that should not be public - feel free to delete this bugreport. I tried to use the security@ contact two times, but never received a response, so decided to file here before I move on.

Stebalien commented 7 years ago

We should definitely be removing the duplicate entries.


As for the security implications, this definitely isn't a good bug to have but would probably be hard to exploit. The (really) worrying part is that you didn't get a timely response; I've opened up an internal discussion to figure out what happened there.


Note: That part of the IPLD spec is only tangentially related. IPFS directories have a very different structure.

mib-kd743naq commented 7 years ago

We should definitely be removing the duplicate entries.

Only if there are guarantees within all readers ( http, cat, fuse ) that the first encountered "named link" is the one that wins. Otherwise the problem will only get worse.

Stepping back a bit - I was really hoping for actual marking of all duplicates or semi-duplicates as "Hey! Something is fishy". The thing I constructed is a simple bit-for-bit duplication. However, when you consider all the nastiness within https://github.com/ipfs/go-ipfs/issues/1710, just hiding an offender is likely insufficient, as it will miss:

The (really) worrying part is that you didn't get a timely response

It's growing pains within the team, it is understandable. If anything - this is a relatively low profile issue, so having the report-loop tested with that is way better than some sort of RCE ;)

kpcyrd commented 7 years ago

The byte, that can be rigged to make ipfs ls output look... different

This sounds like terminal control characters might go through as well. Maybe worth a try. A program that prints non-binary to stdout should never print special control characters, partly because of this, but also because it's sort-of an injection problem if stdout is piped to stdin of another program that expects a line separated "protocol". I usually never print user input without repr() or {:?} because of this. Did you test for newline?

Regarding duplicate entries: I think the \x00 byte might cause a truncation (depending on if there are null byte checks before go calls open(2)). Duplicate entries have been exploited on zip for android, I don't think I can recall all details, but the problem was that the first entry wins if you access the zip file with the library, but the last entry wins if you write it to disk, since the 2nd entry would just overwrite the first. For directories, it's probably going to merge them.

mib-kd743naq commented 7 years ago

@kpcyrd I didn't even think to test this, but yes ( example outputs color codes only, is safe to ls ): https://ipfs.io/ipfs/QmZcYTLQNo1hJtGxQD8d9sHtmnBBamhDHydtqJ2zxzLSx7

~$ ipfs ls QmZcYTLQNo1hJtGxQD8d9sHtmnBBamhDHydtqJ2zxzLSx7 
QmeHXsYCtWa9KhC5ZME5pn97c1zTt7TgK37VgKJWWAdknn 9 
===========
RED ALERT
===========
~$ ipfs ls QmZcYTLQNo1hJtGxQD8d9sHtmnBBamhDHydtqJ2zxzLSx7 | hexdump -C
00000000  51 6d 65 48 58 73 59 43  74 57 61 39 4b 68 43 35  |QmeHXsYCtWa9KhC5|
00000010  5a 4d 45 35 70 6e 39 37  63 31 7a 54 74 37 54 67  |ZME5pn97c1zTt7Tg|
00000020  4b 33 37 56 67 4b 4a 57  57 41 64 6b 6e 6e 20 39  |K37VgKJWWAdknn 9|
00000030  20 0a 3d 3d 3d 3d 3d 3d  3d 3d 3d 3d 3d 0a 1b 5b  | .===========..[|
00000040  39 37 6d 1b 5b 35 6d 1b  5b 34 31 6d 1b 5b 35 6d  |97m.[5m.[41m.[5m|
00000050  52 45 44 20 41 4c 45 52  54 1b 5b 35 6d 1b 5b 30  |RED ALERT.[5m.[0|
00000060  6d 1b 5b 35 6d 0a 3d 3d  3d 3d 3d 3d 3d 3d 3d 3d  |m.[5m.==========|
00000070  3d 0a                                             |=.|
00000072
mib-kd743naq commented 7 years ago

@Stebalien All the problems with non-printables aside - there is also a length-based DoS: http://ipfs.io/ipfs/QmVtFwE9CWfBk7WC2pCgPtMQsZ7zF6tgjbL5Vcpte7pLcy

Trying to follow the longer-named links results in either a 414 ( decent ) or 502 ( ugh ) on the gateway. On the CLI things are no better:

~$ ipfs ls QmVtFwE9CWfBk7WC2pCgPtMQsZ7zF6tgjbL5Vcpte7pLcy
Qme57pNu4VvdX5yZdVacxcr4VyxqtgoymeKGPzRmmm121Q 26 This string can be very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very very long
QmcMeKrhJXNcyExPtAYKUfbJpn6WGweR7KbyV6CRMbD3Rm 29 This subdir is *really* large - do not list its contents/
~$ ipfs ls 'QmVtFwE9CWfBk7WC2pCgPtMQsZ7zF6tgjbL5Vcpte7pLcy/This subdir is *really* large - do not list its contents/I mean really - this will dump 1mb of nyancat on your screen...' | wc -c
1114145
Stebalien commented 7 years ago

Only if there are guarantees within all readers ( http, cat, fuse ) that the first encountered "named link" is the one that wins. Otherwise the problem will only get worse.

We'd do this at the lowest level so yes, it would cover all cases. I'd like to just reject such directories outright but directory sharding makes that impossible.

Stepping back a bit - I was really hoping for actual marking of all duplicates or semi-duplicates as "Hey! Something is fishy". The thing I constructed is a simple bit-for-bit duplication.

I'd like to handle this stuff at a layer below the UI. Users should (almost) never be asked to deal with security related things because they'll generally just be confused.

The thing I constructed is a simple bit-for-bit duplication. However, when you consider all the nastiness within #1710, just hiding an offender is likely insufficient, as it will miss:

So, we initially wanted Unix compatibility but, as you can see in that thread, sentiments have changed. Limiting unixfs paths to printable, UTF-8 byte sequences is reasonable.

there is also a length-based DoS

For long paths, we've considered limiting the length of individual components in a path (not entire paths, just components). However, we haven't picked a good limit. Currently, there's a hard limit of slightly less than 1MiB (due to other constraints) but we almost definitely want something shorter (4096?). Gateways can choose to reject any paths they want and we should probably set a "paths longer than X may not work, avoid them" limit.

As for large directories, that's something we can't fix. We want to, e.g., support dumping Wikipedia into IPFS (and every article on Wikipedia lives in /wiki). However, we do need to stream directories when listing them instead of blocking (I can't find the issue but I'm pretty sure there is one about this).


If you're feeling adventurous, want to read through #1710 and take a stab at a reasonable path restriction proposal?

mib-kd743naq commented 7 years ago

If you're feeling adventurous, want to read through #1710

I have read it, multiple times: I was the one who linked it into the thread ;)

take a stab at a reasonable pathrestriction proposal?

I currently can't commit to that, due to severe time shortage ( sorry :/ ). I will only make a rather off-the-cuff remark about something I think has not been explicitly explored:

It seems that the question of "what fundamentally is the UnixFS portion of IPFS, and what naming rules apply" remains unresolved ( that's really a question for @jbenet and friends )

A different way of asking the above question is:

An explicit answer to this will inform where the rest of this ticket goes. Furthermore this answer must be provided and codified sooner rather than later, due to the accelerating growth of 3rd party implementations

Kubuxu commented 7 years ago

@mib-kd743naq my question is: it it bad that this can happen? and how does for example ls command work with it. Unix tooling is a great resource as they had to figure it out long ago and had a lot of time to think about it, it is still a resource and we might want to make some changes.

mib-kd743naq commented 7 years ago

my question is: it it bad that this can happen?

@Kubuxu the problem is that I do not know :/

Intuitively it is a rather nasty situation when "input arriving from the internet" is able to manipulate your cursor position and essentially redraw the entirety of the terminal. There is something about IPFS being directly connected to the net that makes the scenario less palatable than say a .tar.gz with the same content being tar -ztf-ed ( though my version of tar handles this perfectly, see below ) .

Not having a clear and well thought through answer to "is this a problem", is why I sidestepped this question entirely, and instead put together the "which one of the two" writeup above. It seems clearer from that standpoint that IPFS can't be both "fully POSIX compliant" and "an upgraded protocol with less cruft".

For me personally - "Fully POSIX compliant" is the answer, with the ( significant ) work needed to make the tooling behave better. But I really want to defer this to @jbenet in part because of his proven visionary track record, and in part because he has extensively thought through this already ( or so I hope ).

how does for example ls command work with it

Way better than I expected:

/dev/shm$ ls fooo/
?[97m?[5m?[41m?[5mRED ALERT?[5m?[0m?[5m?
/dev/shm$ tar -tf fooo.tar 
fooo/
fooo/\033[97m\033[5m\033[41m\033[5mRED ALERT\033[5m\033[0m\033[5m\n
Kubuxu commented 7 years ago

As we are working on https://github.com/ipfs/ipld-unixfs/issues/1, if the decision is to allow all link names then best part forward (I think) would be to also to spec out way of escaping them for the user.

It seems clearer from that standpoint that IPFS can't be both "fully POSIX compliant" and "an upgraded protocol with less cruft".

I disagree with this statement, what is stored (and can be accessed through the API) versus what is presented to the user are two things that can be different. As most of shells are not about to handle binary strings (as variables) it makes sense to have encoding in presenting them.

mib-kd743naq commented 7 years ago

@Kubuxu in light of

Currently, IPLD specifies that all map/object key names must be Unicode text ( via https://github.com/ipfs/ipld-unixfs/issues/3#issuecomment-341289386 )

does my statement about "IPFS/IPLD can't be both at the same time" make more sense?