maidsafe / sn_browser

Safe Network Browser Application
BSD 3-Clause "New" or "Revised" License
201 stars 55 forks source link

Cannot fetch content with a XOR-URL which typeTag is greater than 65535 #429

Open bochaco opened 5 years ago

bochaco commented 5 years ago

The browser considers a XOR-URL invalid when the typeTag part is greater than 65535 e.g. safe://hyfktcegr68th4og68y9eks99cie14qtw44zpmjdttqqujxr3bb5jnnip7r:160485 triggers error Invalid URL: safe://hyfktcegr68th4og68y9eks99cie14qtw44zpmjdttqqujxr3bb5jnnip7r:160485

joshuef commented 5 years ago

So i've done some digging around, and this is for sure not even reaching any webRequest we can hook into.

So it looks, we file a bug on the basis that this isn't expected behaviour (as technically port numbers can be larger based on the spec).

BUT, this behaviour exists in chrome (larger ports are just treated as search queries and no attempt is made to resolve them...), so I doubt very much that it'll get much traction.

SOOOOOOOOooooo...

How do we want to handle this? We can attempt an ugly workaround... catch the errors... check if its a BIG port number/XOR url, then fetch it, and server that from another URL, which we then mask with modifying headers etc...

Or... something else?

hunterlester commented 5 years ago

@bochaco @joshuef I'm confused, looking at the TCP RFC, page 15, a port can only be a 16-bit value, right? Which spec refers to port numbers larger than 2**16 - 1?

hunterlester commented 5 years ago

Ah I see, port range not specified in URL spec: https://www.w3.org/Addressing/URL/url-spec.txt It seems right though to assume a 16-bit port number as defined by TCP/IP and that we shouldn't conflict with it.

When encoding the XOR-URL, I think we need to change from using a colon to specify MData type tags to another character such as search query. ?, such that safe://hyfktcebyxbzht6j6o7fcwbbjh4o3chp16131q437t3kst5foer6ehzchny?150001.

hunterlester commented 5 years ago

The encoding ues multiformats such as multibase, should we also have a multiformats for our network data types so that, for example, the encoding specifies whether it represents an MData and should have a type tag or whether it's an IData and should not have a type tag?

joshuef commented 5 years ago

I think the multiformats has already been covered in the current implementation @hunterlester . It's definitely included in the deps now. Though not sure if it's used for exactly what you're describing.

I'm all for an alternative char I think. ?: could this be confused with query params by a webapp? Maybe something still valid in a url, but not used as part of URL structs in general.

From the URL spec:


  alphanum2              alpha | digit | - | _ | . | +  

  xalpha                  alpha | digit | safe | extra | escape   

  xalphas                 xalpha [ xalphas ]   

  xpalpha                 xalpha | +   

  xpalphas                xpalpha [ xpalphas ]   

  ialpha                  alpha [ xalphas ] 

  alpha                   a | b | c | d | e | f | g | h | i | j | k |
                         l | m | n | o  | p | q | r | s | t | u | v |
                         w | x | y | z | A | B | C  | D | E | F | G |
                         H | I | J | K | L | M | N | O | P |  Q | R |
                         S | T | U | V | W | X | Y | Z   

  digit                   0 |1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9   

  safe                    $ | - | _ | @ | . | &  | + | - 

  extra                  ! | * |  " |  ' | ( | )  | , 

  reserved                 =  |  ;  |  /  |  #  | ? |  : | space 

  escape                  % hex hex   

  hex                     digit | a | b | c | d | e | f | A | B | C |
                         D | E | F   

  national                { | } | vline | [ | ] | \ | ^ | ~   

  punctuation             < | > 

  digits                  digit [ digits ]   

  alphanum                alpha | digit   

  alphanums               alphanum [ alphanums ] 

From that I'd suggest not going for something reserved. I think what we need is an unreserved char, that's not present in the base32 implementation. So how's about ! ?

safe://hyfktcebyxbzht6j6o7fcwbbjh4o3chp16131q437t3kst5foer6ehzchny!150001