google / physical-web

The Physical Web: walk up and use anything
http://physical-web.org
Apache License 2.0

URL Checker #671

Closed russellmorton closed 8 years ago

russellmorton commented 8 years ago

Is there a tool to check if a URL is Physical Web-compatible, particularly within the Chrome implementation?

I have a beacon with endpoint URL https://mobile.twitter.com/search?q=sbincubator and it is not showing on Chrome for Android.

Also, can anyone see why this URL would not show on Chrome for Android?

beaufortfrancois commented 8 years ago

I think this is because https://twitter.com/robots.txt prevents Googlebot (used by the Physical Web Service) from crawling some URLs, such as /search?q=. @mmocny will know more for sure, though.
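
A robots.txt rule of this kind can be checked locally before configuring a beacon. The following is a minimal sketch using Python's standard `urllib.robotparser`. The rules in `SAMPLE_ROBOTS_TXT`, the `is_crawlable` helper, and the `mobile.example.com` host are illustrative assumptions, not Twitter's actual file or the Physical Web Service's actual logic.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; this is NOT Twitter's real robots.txt.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() accepts the file as an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical beacon URLs: a site root and a search page.
    for url in (
        "https://mobile.example.com/",
        "https://mobile.example.com/search?q=sbincubator",
    ):
        print(url, is_crawlable(SAMPLE_ROBOTS_TXT, "Googlebot", url))
```

To test a live site rather than the inline sample, `RobotFileParser.set_url()` followed by `read()` fetches the real file.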

russellmorton commented 8 years ago

Thanks François. I tried with just https://www.twitter.com and it also does not work. Does this mean that we won't be able to broadcast any Twitter URLs?

On Tue, May 31, 2016 at 1:52 PM François Beaufort notifications@github.com wrote:

I think this is because https://twitter.com/robots.txt prevents GoogleBot (used by Physical Web Service) to crawl some URLs such as /search?q=. @mmocny https://github.com/mmocny will know more for sure though

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/google/physical-web/issues/671#issuecomment-222666345, or mute the thread https://github.com/notifications/unsubscribe/AG6lvExYm-MfiDcZDMVSiTqYsKRda-s5ks5qHCDogaJpZM4IqU2A .

Regards,

Russell M Morton

beaufortfrancois commented 8 years ago

You may want to try https://mobile.twitter.com/ ;)

russellmorton commented 8 years ago

Thanks, that works. I will wait for Michel to confirm, but it seems you are saying there is no way to get my original URL to work on Chrome for Android? Interestingly, it works on iOS.

beaufortfrancois commented 8 years ago

That's because the Chrome for iOS and Chrome for Android implementations differ for now: https://github.com/google/physical-web/blob/master/implementation-status.md#chrome. They will behave the same soon, though.

russellmorton commented 8 years ago

@mmocny any further information on how I could get the link in my original post to work?

mmocny commented 8 years ago

@russellmorton As @beaufortfrancois says, we respect the website's robots.txt file and do not crawl URLs that it disallows.

However, looking at Twitter's robots.txt file, it appears they provide a way for Google to crawl that we are not taking advantage of. I will investigate whether this is something we can fix server-side, and I will update you when I know more. (Apologies if this takes a few days.)

As for the iOS question: as @beaufortfrancois also says, Chrome for iOS has already been updated to return the same results as Chrome for Android, but the change hasn't shipped to the stable version yet. You can expect it in version m52.

Additionally, we have a new team member working on updating the OSS sample apps (those within this repo) to bring them more in line with the Google product integrations and make things less confusing. Today our scanners vary a lot, both for historical reasons and because of experimental variations, and it's time to clean up!

Thanks for bringing up the issue.

(Also, sneak peek: I am trying to help @beaufortfrancois build a diagnostic tool that gives you insight into why URLs are failing to resolve. I will share more when it is ready, but hopefully you won't need a GitHub issue for every question.)

Cheers.

russellmorton commented 8 years ago

Thanks Michel, I will wait for your feedback on this Twitter robots.txt query.

ferencbrachmann commented 8 years ago

Guys, we're developing our own forwarder and have experienced an issue. Here's a URL to test the issue:

https://beeem.co/em00004

The really odd part is that the resolved URL (https://beeem.co/p/GB/Bushey/WatfordBathroomandKitchens/watfordbath) shows up in Chrome for Android (see attached screenshots) but not in the Nearby list.

[Screenshots: 2016-07-21 08 27 55, 2016-07-21 08 28 24]

anirudhcmohan commented 8 years ago

It looks like the shortened URL that you've linked to doesn't resolve in the browser either (I see this when I go to the URL directly in Chrome). Are you experiencing this too?

nondebug commented 8 years ago

I think you meant https://beeem.co/em000004 (added a zero), which redirects as expected.

I configured a beacon with the URL and was able to see it in Chrome for Android and the Today view widget in Chrome for iOS, but not in Nearby. Other URLs were showing in Chrome but not in Nearby (including https://www.google.com) so I think this is a Nearby issue and not related to your URL.

nondebug commented 8 years ago

It looks like the metadata service isn't finding a description for your page and is omitting the field from the returned JSON, which might be what's tripping up Nearby. As a workaround, could you try adding a description to your page?

Example:
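
A page description is commonly declared with a `<meta name="description" content="...">` tag in the page's `<head>`. Assuming that is the field the metadata service looks for (the thread does not confirm this), the sketch below checks an HTML string for a non-empty description. `DescriptionFinder`, `find_description`, and both sample pages are hypothetical.

```python
from html.parser import HTMLParser
from typing import Optional


class DescriptionFinder(HTMLParser):
    """Records the first non-empty <meta name="description"> content it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.description: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.description is not None:
            return
        attributes = dict(attrs)  # attribute names arrive lowercased
        if (attributes.get("name") or "").lower() == "description":
            content = (attributes.get("content") or "").strip()
            self.description = content or None


def find_description(html_text: str) -> Optional[str]:
    """Return the page description, or None if the page does not declare one."""
    finder = DescriptionFinder()
    finder.feed(html_text)
    return finder.description


# Hypothetical pages: one with a description, one without.
PAGE_WITH_DESCRIPTION = (
    "<html><head><title>Shop</title>"
    '<meta name="description" content="Opening hours and offers">'
    "</head><body></body></html>"
)
PAGE_WITHOUT_DESCRIPTION = "<html><head><title>Shop</title></head><body></body></html>"

if __name__ == "__main__":
    print(find_description(PAGE_WITH_DESCRIPTION))
    print(find_description(PAGE_WITHOUT_DESCRIPTION))
```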

ferencbrachmann commented 8 years ago

Thanks! Will do! It's a bit complicated because this is a CMS designed for the Physical Web and the page is rendered by our web app. We'll change the code and let you know.

ferencbrachmann commented 8 years ago

I think some of the confusion stems from the fact that when Nearby takes over beacon discovery, some beacon URLs that worked in the past get filtered out by Nearby.

scottjenson commented 8 years ago

There appears to be a bug in Nearby where websites without a description are filtered out. This is clearly a bug, but there is an easy workaround.

scottjenson commented 8 years ago

You can now check URLs against the Physical Web Service (PWS) by going here: verify.physical-web.org

jugaltheshah commented 8 years ago

Awesome!! Any chance of a Chrome extension or something similar, so that sites behind a firewall can be checked? Would that even make sense, or can only public-facing sites be broadcast? That would be unfortunate, as it cuts off a broad swath of potential use cases.

scottjenson commented 8 years ago

The whole point of the Physical Web Service is to cache and vet the websites we're showing to users. We are, structurally, just like google.com. Your request is much like asking if google.com can show sites behind a firewall. I get why you'd like to do this, of course.

If you are willing to make people wait just a bit, you could take them to a public landing page and then use a JS redirect into your page. Not perfect but you could try things out.
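
The landing-page idea could look roughly like the following sketch, which builds the HTML for a public page that redirects via JavaScript after a delay and also offers a plain link. The function name `landing_page`, the 3000 ms default, and the `intranet.example` target are assumptions for illustration. The delay is arbitrary, and nothing in this thread establishes a value the crawler will tolerate.

```python
import html
import json


def landing_page(target_url: str, delay_ms: int = 3000) -> str:
    """Build a public landing page that forwards to `target_url` after `delay_ms`."""
    # Escape the URL once for the HTML attribute and once for the JavaScript string.
    href = html.escape(target_url, quote=True)
    js_url = json.dumps(target_url).replace("</", "<\\/")
    return f"""<!DOCTYPE html>
<html>
<head>
  <title>Continue to the local page</title>
  <meta name="description" content="Landing page that forwards to a local site.">
</head>
<body>
  <p><a href="{href}">Continue</a></p>
  <script>
    setTimeout(function () {{ window.location.href = {js_url}; }}, {int(delay_ms)});
  </script>
</body>
</html>
"""


if __name__ == "__main__":
    # Hypothetical internal target; replace with the real firewalled page.
    print(landing_page("http://intranet.example/kiosk?a=1&b=2", delay_ms=5000))
```

Keeping the visible link alongside the script means the page still works as a click-through if the redirect is dropped.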

jugaltheshah commented 8 years ago

Fair enough. I'll have to experiment, but won't the URL verifier try to follow the redirect and then 404? Is there a timeout value that would be safe?

scottjenson commented 8 years ago

Unfortunately, there is no safe value; this is just a bit of a hack you can use to try something out. If it isn't too much trouble, I'd just put up a public landing page and have people click through. I realize this isn't perfect, but you could get that working 100% right now.