tosdr / tosdr.org

ARCHIVED Source code for tosdr.org
https://github.com/tosdr/CrispCMS
GNU Affero General Public License v3.0

API discovery and multiple domains for a website (title edited) #21

Closed Skalman closed 12 years ago

Skalman commented 12 years ago

Google has lots of websites (google.com, google.de, google.se and so on), and I think it'd be useful to only show them once. The API should thus allow for multiple domains per site.

hugoroy commented 12 years ago

Yes, this is something we also need for the extension. For now, Google seems to be the one that really needs it. Do you see other examples?

hugoroy commented 12 years ago

What about https://github.com/unhosted/ToS-DR/commit/24834926a2eea26e2d24dd061f34af4d73872440 ?

hugoroy commented 12 years ago

@Skalman :

Huh... Now that I look at it the API isn't good for discovery. If I know the domain is "facebook.com", why is it that it's not included in the API request? I think the URL should be /services/facebook.com.json. If it will ever evolve, you should include a version too, like /services/0.1/facebook.com.json. I mean, in the future you might want to provide more/other info too... Of course the disadvantage of having versioning is that you have to maintain multiple APIs.

So what I'm saying is that each resource should probably have its own data. Internally it could be stored in another way to avoid duplication.
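
A minimal sketch of that idea, assuming a hypothetical in-memory store: each service is kept once with a list of domains, and a per-domain index makes every domain resolve to the same shared record. The `services` object, the `resourcePath` and `lookup` helpers, and the sample domains are illustrative assumptions, not part of the actual ToS;DR code; only the `/services/0.1/<domain>.json` path shape comes from the comment above.

```javascript
// Hypothetical internal storage: one record per service, many domains each.
const services = {
  google: { name: "Google", domains: ["google.com", "google.de", "google.se"] },
  facebook: { name: "Facebook", domains: ["facebook.com"] },
};

// Build a domain -> service id index once, so every domain shares one record.
const domainIndex = new Map();
for (const [id, service] of Object.entries(services)) {
  for (const domain of service.domains) {
    domainIndex.set(domain, id);
  }
}

// Resource path in the style suggested above; "0.1" is the example version.
function resourcePath(domain, version = "0.1") {
  return `/services/${version}/${encodeURIComponent(domain)}.json`;
}

// Resolve the data a request for this domain would return, or null if unknown.
function lookup(domain) {
  const id = domainIndex.get(domain.toLowerCase());
  return id ? { id, ...services[id] } : null;
}

console.log(resourcePath("facebook.com"));
console.log(lookup("google.de").id);
```

With this layout each domain still has its own URL, while the data behind google.com, google.de and google.se is stored only once.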

I guess it's more or less the same problem.

@darkpicnic pointed out that the same domain can have multiple related ToS, etc., so it's not ideal either. Here's what he proposed:

I recommend modeling your API after the CrunchBase API, where your method for acquiring a product goes:

1) Search for a product or company
2) Receive a list of products or companies that match the search
3) Determine which product is the one you desire
4) Retain the product id for later retrieval
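
The four steps above could look roughly like this from a client's point of view. This is only a sketch against a made-up in-memory catalogue: the `catalogue` entries, their ids, and the `search` and `getById` helpers are assumptions for illustration and do not describe the CrunchBase API or any existing ToS;DR endpoint.

```javascript
// Hypothetical in-memory catalogue; ids and names are made up for illustration.
const catalogue = [
  { id: 1, name: "Google Search", company: "Google" },
  { id: 2, name: "Google Mail", company: "Google" },
  { id: 3, name: "Facebook", company: "Facebook" },
];

// Steps 1 and 2: search by free text and return every matching entry.
function search(query) {
  const needle = query.toLowerCase();
  return catalogue.filter(
    (entry) =>
      entry.name.toLowerCase().includes(needle) ||
      entry.company.toLowerCase().includes(needle)
  );
}

// Later retrieval by the id the client kept in step 4.
function getById(id) {
  return catalogue.find((entry) => entry.id === id) || null;
}

const matches = search("google"); // steps 1 and 2
const chosen = matches.find((m) => m.name === "Google Mail"); // step 3: the client picks
const savedId = chosen.id; // step 4: keep the id
console.log(getById(savedId).name);
```

The point of the design is that the id, not the domain, becomes the stable key, so one domain can carry several products with different terms.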

hugoroy commented 12 years ago

My comment is: shouldn't this be dealt with separately? I mean in the API, but not in the JSON file, where we could stick to the simple "id" and "url" pair.

outergod commented 12 years ago

I was just about to raise a ticket when I found this (presumably) identical issue.

The current Firefox and Chromium implementations use a broken regexp for each domain (what you store here is not really a URL!): 'https?://[^:]*' + service.url + '.*'. For example, in the case of facebook.com, this matcher would also yield a spurious match for my-blog-about-facebook.com.

The only "real" solution is obviously an additional store with a 1:n relation between services and the domains (top plus second level) they host and serve users with, just as the OP suggests. Matching full domains (including subdomains) would then be performed against 'https?://(?:[^:]*\.)?' + item + '(?:/.*)?' for each item in service.domains.

Please give me your feedback.
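
To make the false positive concrete, here is a small sketch that builds the loose pattern quoted above and compares it with a stricter per-domain matcher. The stricter version is one possible reading of the proposal, not the extensions' actual code: the anchors, the regex escaping of the domain, the optional port, and the `escapeRegExp`, `strictMatcher` and `matchesService` helper names are all assumptions added for illustration.

```javascript
// Escape regex metacharacters so "." in a domain matches only a literal dot.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// The loose pattern quoted above, built from a single service.url.
function looseMatcher(url) {
  return new RegExp("https?://[^:]*" + url + ".*");
}

// Stricter sketch: optional subdomains, then exactly this domain,
// then an optional port and an optional path.
function strictMatcher(domain) {
  return new RegExp(
    "^https?://(?:[^:/]*\\.)?" + escapeRegExp(domain) + "(?::\\d+)?(?:/.*)?$"
  );
}

// A service matches when any of its domains matches (the 1:n relation).
function matchesService(service, pageUrl) {
  return service.domains.some((d) => strictMatcher(d).test(pageUrl));
}

// Hypothetical service record with several unrelated domains.
const google = { domains: ["google.com", "google.de", "google.se"] };

console.log(looseMatcher("facebook.com").test("http://my-blog-about-facebook.com/"));
console.log(strictMatcher("facebook.com").test("http://my-blog-about-facebook.com/"));
console.log(matchesService(google, "https://www.google.de/search?q=tos"));
```

The loose matcher accepts the unrelated blog domain, while the stricter one rejects it and still accepts subdomains such as www.google.de.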

hugoroy commented 12 years ago

anyone @michielbdejong @shybyte @AbdullahDiaa ?

outergod commented 12 years ago

Can we somehow move this issue to browser-extensions? Also, I'd really like feedback @michielbdejong @shybyte @AbdullahDiaa :>

michielbdejong commented 12 years ago

Hi! I created http://tos-dr.info/ratings.json, which gives exact domain names. If we use this in all extensions, the extensions don't expose browsing behaviour, which would be an added bonus. I'll create an issue on the new repo, so this one can be closed, I guess.
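
As a rough illustration of why a locally held list avoids exposing browsing behaviour: the extension can download the file once and then answer every lookup without a per-page request. The shape of ratings.json is not shown in this thread, so the flat domain-to-rating object below, the sample values, and the `ratingFor` helper are purely assumptions.

```javascript
// Assumed shape: { "<domain>": "<rating>" }. The real ratings.json may differ,
// and these values are placeholders rather than actual ratings.
const ratings = {
  "facebook.com": "sample-rating-a",
  "google.com": "sample-rating-b",
};

// Look up a page URL locally; no network request is made per visited page.
function ratingFor(pageUrl, table) {
  const labels = new URL(pageUrl).hostname.toLowerCase().split(".");
  // Try the full hostname first, then each parent domain (www.x.com -> x.com).
  for (let i = 0; i < labels.length - 1; i++) {
    const candidate = labels.slice(i).join(".");
    if (Object.prototype.hasOwnProperty.call(table, candidate)) {
      return table[candidate];
    }
  }
  return null;
}

console.log(ratingFor("https://www.facebook.com/settings", ratings));
console.log(ratingFor("http://my-blog-about-facebook.com/", ratings));
```

Because the comparison is on whole domain labels, a lookalike host such as my-blog-about-facebook.com finds no entry in this sample table.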

On Tue, Oct 2, 2012 at 10:22 AM, Alexander Kahl notifications@github.com wrote:

Can we somehow move this issue to browser-extensions? Also, I'd really like feedback @michielbdejong https://github.com/michielbdejong @shybyte https://github.com/shybyte @AbdullahDiaa https://github.com/AbdullahDiaa :>


michielbdejong commented 12 years ago

See https://github.com/didnotread/browser-extensions/issues/3

outergod commented 12 years ago

ratings.json, in this form, does not solve the issue. As I already said, a service can use n domains, none of which is necessarily a subdomain of another.

xMartin commented 12 years ago

Weren't you the one who asked for moving the issue? :) We did.

outergod commented 12 years ago

Ha, sorry, you're right :) Will comment on the new one.