google / physical-web

The Physical Web: walk up and use anything
http://physical-web.org
Apache License 2.0

URL Limit- 495 Characters #753

Open kevinahuber opened 8 years ago

kevinahuber commented 8 years ago

The PWS is currently limiting URLs to 495 characters. Beyond that, I get the following error:

  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 1535, in __call__
    rv = self.handle_exception(request, response, e)
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 1529, in __call__
    rv = self.router.dispatch(request, response)
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 1278, in default_dispatcher
    return route.handler_adapter(request, response)
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 1102, in __call__
    return handler.dispatch()
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 572, in dispatch
    return self.handle_exception(e, self.app.debug)
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 570, in dispatch
    return method(*args, **kwargs)
  File "/base/data/home/apps/s~url-caster/1.392839177079585096/handlers.py", line 78, in post
    output = helpers.BuildResponse(objects)
  File "/base/data/home/apps/s~url-caster/1.392839177079585096/helpers.py", line 69, in BuildResponse
    siteInfo = GetSiteInfoForUrl(url, distance, force_update)
  File "/base/data/home/apps/s~url-caster/1.392839177079585096/helpers.py", line 169, in GetSiteInfoForUrl
    siteInfo = FetchAndStoreUrl(siteInfo, url, distance, force_update)
  File "/base/data/home/apps/s~url-caster/1.392839177079585096/helpers.py", line 221, in FetchAndStoreUrl
    return GetSiteInfoForUrl(final_url, distance, force_update)
  File "/base/data/home/apps/s~url-caster/1.392839177079585096/helpers.py", line 166, in GetSiteInfoForUrl
    siteInfo = models.SiteInformation.get_by_id(url)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/utils.py", line 160, in positional_wrapper
    return wrapped(*args, **kwds)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/model.py", line 3602, in _get_by_id
    return cls._get_by_id_async(id, parent=parent, **ctx_options).get_result()
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/tasklets.py", line 383, in get_result
    self.check_success()
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/tasklets.py", line 430, in _help_tasklet_along
    value = gen.send(val)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/context.py", line 718, in get
    mkey = self._memcache_prefix + key.urlsafe()
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/key.py", line 561, in urlsafe
    urlsafe = base64.b64encode(self.reference().Encode())
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/key.py", line 546, in reference
    namespace=self.__namespace)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/utils.py", line 160, in positional_wrapper
    return wrapped(*args, **kwds)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/key.py", line 673, in _ConstructReference
    reference = _ReferenceFromPairs(pairs, app=app, namespace=namespace)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/key.py", line 766, in _ReferenceFromPairs
    (_MAX_KEYPART_BYTES, idorname))
ValueError: Key name strings must be non-empty strings up to 500 bytes; received http://www.google.com/?bdfuihasdiuyfghuyshfuyagsdhauifhaiuosgfiuashfuijasifugasifaiushjfiuahsufiuasfjasfhuafsdhiufsduihsdfiuhfsdhuifsduhifeshuifsduhifsdiuhsdfuihdsfhiudfsuhidfsuihdfshuidfsiuhdsfhiusdfhiudsfhuidfshiudfshiudfshiudsfhuidfshiudfshuidfsiuhdfsiuhiudfshiufdshuidfshiudfsuhisdfiuhdfhsuiihufdshiudfshiudfshiubdfuihasdiuyfghuyshfuyagsfuhasuifhaiuosgfiuashfuijasifugasifaiushjfiuahsufiuasfjasfhuafsdhiufsduihsdfiuhfsdhuifsduhifeshuifsduhifsdiuhsdfuihdsfhiudfsuhidfsuihdfshuidfsiuhdsfdsdsdasdsad=

This looks like it comes from the PWS caching service using the URL as the key name in ndb, where key names are limited to 500 bytes.
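To make the limit concrete, here is a minimal sketch (not PWS code) of the check that the error message above describes. The 500-byte value is taken from the `ValueError` text; the function name `fits_as_key_name` is hypothetical.

```python
# Limit taken from the ValueError in the traceback above.
MAX_KEYPART_BYTES = 500


def fits_as_key_name(url):
    """Return True if url is non-empty and at most 500 bytes once encoded."""
    # The limit is in bytes, so encode text before measuring it.
    encoded = url if isinstance(url, bytes) else url.encode("utf-8")
    return 0 < len(encoded) <= MAX_KEYPART_BYTES
```

Because the limit counts bytes rather than characters, a URL containing non-ASCII characters can hit it with fewer than 500 characters.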

While my example URL is a bit ridiculous, 500 seems quite low. We can go up to 2 KB through urlfetch, and most browsers go even higher than that.

Thoughts on how to move forward? I am happy to contribute on this!

oahziur commented 8 years ago

I think @mmocny knows more about the open source PWS.

Doing a quick check, it seems we are not defining a key property in the data model. Is it possible to define key=ndb.TextProperty() inside the cache model? Otherwise, we can index the url field.

kevinahuber commented 8 years ago

@mmocny Do you have any additional guidance on this? I'll go ahead and try to tackle it.

scottjenson commented 7 years ago

@oahziur @mmocny any comment on this?

rochforp commented 7 years ago

@scottjenson @mmocny has there been any movement on this? I'd like this limit to be raised to ~1000 characters if possible. In some instances my URLs get very long, so I'm using a shortener, but they are still getting ignored in PhyWeb and Chrome because of the PWS URL character limit. However, Magnet does see these longer URLs with their PWS and browser, so I just wanted to check and see what Chrome was gonna do before I continue with some projects.

mmocny commented 7 years ago

Folks, this is a bug in the open source (Python + App Engine) PWS, which is in this repo.

The URL character length limit for Google's internal PWS is 2083 (that's what is used by Chrome and Nearby), and that is the URL length limit for browsers as well.

I can try to take a stab at fixing the python PWS later this week. Or, maybe someone would like to make a patch? Great opportunity to make a quick contribution!

kevinahuber commented 7 years ago

@mmocny Ah, this fell by the wayside. I'd love to tackle this, but I am swamped with work this weekend. I can in a few weeks, or can help test if someone else wants to take the lead.

mmocny commented 7 years ago

No hurry! Thanks for offering. I'll still see if I can take a stab soon.

rohinrohin commented 7 years ago

@mmocny @kevinahuber can you guide me as to how I can solve this issue? I am proficient in python and a very quick learner :)

kevinahuber commented 7 years ago

@rohinrohin Excellent! As far as I understand... the PWS stores "site information" using the URL as the key. This is an issue, since the key has a limit of 500 bytes (this is where my knowledge evaporates). https://github.com/google/physical-web/blob/master/web-service/models.py

We'd want to support URLs of at least 2048 characters, as that is the lowest limit most technologies have.
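One possible direction, sketched here as an assumption rather than as the agreed fix: derive a fixed-length key name from a hash of the URL, so the key stays well under 500 bytes no matter how long the URL is. The helper name `key_name_for_url` is hypothetical, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib


def key_name_for_url(url):
    """Derive a fixed-length key name from a URL of any length.

    Hypothetical helper: the SHA-256 hex digest is always 64 ASCII
    characters, far below the 500-byte key name limit.
    """
    # hashlib works on bytes, so encode text first.
    data = url if isinstance(url, bytes) else url.encode("utf-8")
    return hashlib.sha256(data).hexdigest()
```

The lookup in helpers.py could then pass `key_name_for_url(url)` instead of `url` to `models.SiteInformation.get_by_id`. With this approach the full URL would have to live in a regular property, since it can no longer be read back from the key, and entities already cached under the raw URL would not be found under the new key.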