Non-standard well-known location processing

JeniT commented 9 years ago

From @chaals, re http://www.w3.org/TR/2015/WD-tabular-data-model-20150416/#standard-file-metadata

In section 5.4 and 5.5 there are requirements for dealing with information in well-known locations. But there is no mention of the standard approach of using /.well-known/ as specified in RFC 5785. Why not?

JeniT commented 9 years ago

We have asked for the TAG's opinion on precisely this issue, see https://lists.w3.org/Archives/Public/www-tag/2015Apr/0028.html because we know it's less than ideal.

.well-known is good for locating site-wide metadata. We could not see a good way to use .well-known as specified in RFC 5785 for file- or directory-specific metadata. These methods are provided for publisher convenience in the case where they do not have access to be able to set the Link header; I imagine in this case (eg when publishing through Github) that the publisher won't have access to the .well-known directory either.

Have we missed something? If not, is this sufficient to address the question or do we need to add a note to this effect to the spec?

chaals commented 9 years ago

Hmm. I think there are plenty of cases where people have access to /.well-known/ but not to setting headers.

I agree that figuring out how to use .well-known properly in this case is non-trivial.

Let's see what the TAG (which includes @mnot - one of the people behind .well-known) says, and I'll commit in advance to be happy with that - which means that from my perspective this can be closed.

mnot commented 9 years ago

RFC5785 is more about defining a mechanism for locating metadata for an origin or hostname, rather than for a specific resource.

That said, reading the section linked above, it seems like this is proposing a significant violation of BCP190/RFC7320 http://tools.ietf.org/html/rfc7320:

Processors must locate and retrieve the common metadata document for a directory by resolving the relative URL metadata.json against the base URL of the tabular data file and fetching the resulting URL.

mnot commented 9 years ago

You could, BTW, define a well-known resource for finding resource-specific metadata.

E.g., to find metadata for /foo/bar.json, you could say to look in /.well-known/widget-metadata/foo/bar.json (assuming you register 'widget-metadata').
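
A minimal Python sketch of that mapping may make the idea concrete. The `widget-metadata` suffix is only the placeholder name used in the comment above (nothing is registered under it), and `well_known_metadata_url` is a hypothetical helper name, not part of any spec or library.

```python
from urllib.parse import urlsplit, urlunsplit


def well_known_metadata_url(resource_url: str, suffix: str = "widget-metadata") -> str:
    """Map a resource URL to a per-resource metadata URL under /.well-known/.

    The resource's own path is appended after the (hypothetical) registered
    suffix, so /foo/bar.json becomes /.well-known/<suffix>/foo/bar.json.
    Query and fragment are dropped in this sketch.
    """
    parts = urlsplit(resource_url)
    path = f"/.well-known/{suffix}{parts.path}"
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


if __name__ == "__main__":
    print(well_known_metadata_url("http://example.org/foo/bar.json"))
```

Note that this only works when the publisher controls the root of the host, which is the constraint discussed in the rest of the thread.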

iherman commented 9 years ago

@mnot,

That said, reading the section linked above, it seems like this is proposing a significant violation of BCP190/RFC7320 http://tools.ietf.org/html/rfc7320:

Processors must locate and retrieve the common metadata document for a directory by resolving the relative URL metadata.json against the base URL of the tabular data file and fetching the resulting URL.

Could you elaborate, for the record, which aspect of rfc7320 you think this approach is violating?

Thanks

iherman commented 9 years ago

As @JeniT already said, the problem with this:

E.g., to find metadata for /foo/bar.json, you could say to look in /.well-known/widget-metadata/foo/bar.json (assuming you register 'widget-metadata').

is that, typically, publishers of a CSV file have no control over the overall site where they publish; in particular, they cannot register a .well-known. We could add this option to the current approaches, as yet another way of locating a metadata file, but I do not think this would have any significant uptake. But it should definitely not invalidate what we have now imho.

mnot commented 9 years ago

Ivan,

Read the RFC -- it's pretty self-explanatory. What's proposed here violates 2.3.

iherman commented 9 years ago

@mnot, the text you refer to says:

For example, an application ought not specify a fixed URI path "/myapp", since this usurps the host's control of that space.

Specifying a fixed path relative to another (e.g., {whatever}/myapp) is also bad practice (even if "whatever" is discovered as suggested in Section 3); while doing so might prevent collisions, it does not avoid the potential for operational difficulties (for example, an implementation that prefers to use query processing instead, because of implementation constraints).

So I think the point is: what we do is "bad practice" rather than violation. Indeed, the metadata.json file is not an absolute path on the publisher's site, but it is relative to the original CSV file's URI. Ie, it is the second paragraph above that applies, not the first.

To be honest, I do not see any better solution here for the CSV case. As said before, the usage of .well-known, though possible to add to the system, would not work as a replacement.

gkellogg commented 9 years ago

Indeed, an example where it is not feasible to either use a link header or .well-known, is the project DOAP file I publish at https://raw.githubusercontent.com/ruby-rdf/rdf-tabular/develop/etc/doap.csv, with its companion metadata at https://raw.githubusercontent.com/ruby-rdf/rdf-tabular/develop/etc/doap.csv-metadata.json. GitHub provides no mechanism for assuring metadata location in any other way.

But, we should probably note that we are violating these recommendations along with our rationale someplace in the locating metadata section.
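
For reference, the two default locations under discussion can be sketched in a few lines of Python. This assumes the file-specific location is formed by appending `-metadata.json` to the CSV URL (as in the doap.csv example above) and the directory-level location by resolving the relative URL `metadata.json` against the CSV URL (as in the spec text quoted earlier in this thread). `default_metadata_urls` is a hypothetical name.

```python
from urllib.parse import urljoin


def default_metadata_urls(csv_url: str) -> list[str]:
    """Return the candidate metadata URLs derived from the CSV URL alone."""
    return [
        # File-specific metadata: suffix appended to the CSV URL itself.
        csv_url + "-metadata.json",
        # Directory-level metadata: relative URL resolved against the CSV URL.
        urljoin(csv_url, "metadata.json"),
    ]


if __name__ == "__main__":
    base = "https://raw.githubusercontent.com/ruby-rdf/rdf-tabular/develop/etc/doap.csv"
    for url in default_metadata_urls(base):
        print(url)
```

Both URLs are fixed patterns relative to the data file, which is exactly what the BCP190 concern is about.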

mnot commented 9 years ago

Gregg, Github does provide such a facility, using Github Pages (which allows you to publish at any path on a project's site).

That said, I never got the note that we'd decided to allow Github to constrain what we do with the Web...

Ivan, can you give some criteria you're using for "better"?

danbri commented 9 years ago

Following up discussion on WG call w/ @JeniT and @gkellogg, bookmarking an idea:

What if there was a packaging mechanism for sticking the metadata defined by this WG within a human-friendly HTML page? Thus making it more likely that the URL for the metadata (inside that page) would be shared, linked, used etc. When the metadata is a non-human-useful JSON-LD URL we expect that the raw CSV URL will be the default one that ends up in circulation. The benefit being removing pressure to have a metadata discovery protocol. The cost being a dependency on some kind of HTML(5 etc) parsing.

I made a mock of JSON-LD within script element in HTML: http://danbri.github.io/csvw-template/example2-csv.html

It has most of the metadata in the JSON-LD but also a tiny fragment in HTML RDFa wrapped around the actual link to the CSV.

For reference, here is another version with the whole thing in RDFa: http://danbri.github.io/csvw-template/example.csv.html

Also as an aside, Google is already shipping various search features based on tooling that can extract this kind of markup: https://developers.google.com/structured-data/testing-tool/?url=http://danbri.github.io/csvw-template/example2-csv.html so this isn't a purely theoretical proposal.
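
To give a feel for the parsing dependency mentioned above, here is a rough Python sketch that pulls JSON out of script elements using only the standard library. The `application/ld+json` type is an assumption for illustration (a CSVW-specific type could be substituted), and `JsonLdScriptExtractor` and `extract_jsonld` are hypothetical names.

```python
import json
from html.parser import HTMLParser


class JsonLdScriptExtractor(HTMLParser):
    """Collect the parsed JSON bodies of <script type=...> elements."""

    def __init__(self, script_type: str = "application/ld+json"):
        super().__init__()
        self.script_type = script_type
        self._in_target = False
        self._chunks = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == self.script_type:
            self._in_target = True
            self._chunks = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so accumulate it.
        if self._in_target:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_target:
            self.documents.append(json.loads("".join(self._chunks)))
            self._in_target = False


def extract_jsonld(html: str, script_type: str = "application/ld+json") -> list:
    parser = JsonLdScriptExtractor(script_type)
    parser.feed(html)
    parser.close()
    return parser.documents
```

A processor would still need to decide which embedded document describes which CSV file; this sketch only covers the extraction step.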

iherman commented 9 years ago

I am not sure how this would solve the problem. Do you mean that, instead of a metadata.json file the metadata should be part of an HTML file? But wouldn't we run into the same issue as for the URL of that HTML file?

Maybe what I do not get is your reference to a 'packaging mechanism'. I think we voted down the approach of using packaging in the spec a long time ago…

Ivan

gkellogg commented 9 years ago

The issue seems to be that publishers are not likely to reference a metadata file directly, and would continue to publish the data just as CSV, making metadata discovery more difficult. The assertion is that they might prefer to publish the dataset using an HTML page, which is a more normal workflow. If discovery starts there, for processors able to extract information from HTML, a script tag with an appropriate @type attribute could contain the metadata with references to the CSV. In general, you would expect that the HTML might textually describe the dataset.

I think a tool to automate creating such a description from a JSON manifest, perhaps using some HTML templating, would be straightforward, and might be useful.

This could be a way of side-stepping any issues with metadata discovery via "URL squatting" by providing an alternative mechanism that may be more mutually satisfactory.

iherman commented 9 years ago

I am still lost. While I see the value of the metadata embedded in HTML, that is a good starting point when processing begins with the metadata. However, this issue is about the 'metadata.json' file which, mostly, comes into the picture when processing begins with the CSV file.

danbri commented 9 years ago

Yes, as @gkellogg says, if the metadata is safely packaged (or discoverable via link) inside the kind of human-friendly pages we know and love, then its URL is likely to be hyper-linked, shared, etc. much more readily. If the metadata URL is a page of computers-only gobbledegook, nobody's going to link it from their blog, post it to Facebook/Twitter/Friendster, bookmark it or whatever. So the argument is that people will use some URL to share the whole package of CSV + supporting metadata. Either they'll share or link to the CSV itself (which gets us into familiar URL-editing discovery scenarios), or they'll share a wrapper page or e.g. Apache default directory view.

iherman commented 9 years ago

I think there is a split in the issues. I acknowledge the issue you refer to below (although it may be a bit late to pick that up for now, but that is a different discussion). However, the original issue was how we define the 'fixed' URI-s that are used to access the metadata when we start at a given CSV file. We opted for one solution, including a fixed URI pattern ('metadata.json'), and this choice is what Mark commented on. From that point of view, whether we talk about 'metadata.json' or 'metadata.html' does not seem to be different.

Ie: please, let us split the issue in case we have a separate issue; with the current priorities I am eager to close the issue raised by Mark.

Ivan

danbri commented 9 years ago

@JeniT asked me to comment in #555 during our last call. If you'd prefer a separate issue, feel free. Having a wrapper HTML page (even if it only rel=meta linked to the metadata.json) could significantly reduce the need for a "start from the CSV" option, perhaps to the level that would make existing discovery mechanisms like sitemaps (http://www.sitemaps.org/) viable.

gkellogg commented 9 years ago

Note that if an HTML document were retrieved with both describes and describedby Link headers, referencing the CSV(s) and Metadata files respectively, an application wouldn't necessarily need to parse the HTML.

As a possible best practice:

* Set Link headers as described above.
* Add reference to metadata using script element and @rel=describedby @type=application/csvw+json
* Add reference to CSVs using @rel=describes @type=text/csv @src=<location> or equivalent @itemprop and @src
* Optionally include the metadata within the body of the script element.
* Optionally reflect the content of the CSVs using HTML table elements with @itemid or @resource referencing the CSV files.
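
As an illustration of the first point, a client could pick the metadata reference out of a Link header value without touching the HTML body. The Python sketch below is deliberately simplified: it does not handle commas or angle brackets inside quoted parameter values, and `parse_link_header` and `described_by` are hypothetical helper names. The sample header values are made up.

```python
import re

# One link-value: <target> followed by its parameters, up to the next comma.
_LINK = re.compile(r'<([^>]*)>([^,<]*)')
# One parameter: ;name="quoted value" or ;name=token
_PARAM = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^;\s]+))')


def parse_link_header(value: str) -> list:
    """Split a Link header value into (target, params) pairs."""
    links = []
    for link in _LINK.finditer(value):
        params = {}
        for p in _PARAM.finditer(link.group(2)):
            quoted, token = p.group(2), p.group(3)
            params[p.group(1).lower()] = quoted if quoted is not None else token
        links.append((link.group(1), params))
    return links


def described_by(value: str) -> list:
    """Return targets whose rel contains describedby (case-insensitive)."""
    return [
        target
        for target, params in parse_link_header(value)
        if "describedby" in params.get("rel", "").lower().split()
    ]
```

Relative targets would still need resolving against the URL of the retrieved document before they are fetched.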

iherman commented 9 years ago

I am not sure where we want to go with this. Is the goal to modify the standard text and replace the current mechanism of accessing the metadata? Or is this an informal note in the text for a possible best practice?

I am against the former; we should not mess around with this at this point. (I am playing my role: we planned to issue a final LC+CR at the beginning of June and we seem not to be able to hold that deadline. Let us not add new things...)

I am fine with the latter, but I would prefer to discuss this when we got all the issues closed.

Ivan

gkellogg commented 9 years ago

I'm simply exploring the space of alternative locations. It seems that we may have a way around the problem of using our existing metadata location, by presupposing the existence of suitable URI patterns in .well-known someplace, but this is why we explored an alternative. The idea here is that, even if we were to do something with HTML, it would not necessarily impose the use of an HTML/XML toolchain on processors. But, as you say, we may want something like this as a Note or something else, to give guidance to people publishing CSV using HTML.

JeniT commented 9 years ago

We discussed this on the TAG call today, and got agreement to pursue the design suggested in https://lists.w3.org/Archives/Public/www-tag/2015May/0029.html, namely that we define a .well-known location containing what look like Link header values, but also say that there is some default content for such a file if it's not located.

@mnot was happy with this as a way forward provided that there were some early popular deployments of CSV using metadata where that metadata was not in the default locations, to avoid falling into implementations not looking in the .well-known location. We can obviously do this within our own tests and through wider example materials, but also have to nobble people like @psd who might be early publishers of CSV data with metadata.

yakovsh commented 9 years ago

@JeniT we would need to define scope for this - whether this is specific to CSVW or a generic mechanism for locating files. If it is the latter, then I am afraid it may be something that will need to be handled by either IETF or TAG.

iherman commented 9 years ago

@yakovsh, I am really concerned about a potential delay. I think we should define that for CSVW for now; if the community wants to pick up this issue at some point and push it further, that should be fine. A next version of CSVW would then look at that evolution and adapt to it (in case any change will have been made).

/Cc @JeniT

iherman commented 9 years ago

On 03 Jun 2015, at 22:31 , Jeni Tennison notifications@github.com wrote:

We discussed this on the TAG call today, and got agreement to pursue the design suggested in https://lists.w3.org/Archives/Public/www-tag/2015May/0029.html, namely that we define a .well-known location containing what look like Link header values, but also say that there is some default content for such a file if it's not located.

@mnot was happy with this as a way forward provided that there were some early popular deployments of CSV using metadata where that metadata was not in the default locations, to avoid falling into implementations not looking in the .well-known location. We can obviously do this within our own tests and through wider example materials, but also have to nobble people like @psd who might be early publishers of CSV data with metadata.

@JeniT, for the record, I am happy with this, although the minute details are still a bit foggy (I presume we have to have a formal specification of that .well-known text file, although it may be as simple as saying that the {path} string must be expanded to the original file's path, and leave it at that.)

iherman commented 9 years ago

The issue seems to be that publishers are not likely to reference a metadata file directly, and would continue to publish the data just as CSV, making metadata discovery more difficult. The assertion is that they might prefer to publish the dataset using an HTML page, which is a more normal workflow. If discovery starts there, for processors able to extract information from HTML, a script tag with an appropriate @type attribute could contain the metadata with references to the CSV. In general, you would expect that the HTML might textually describe the dataset.

I think a tool to automate creating such a description from a JSON manifest, perhaps using some HTML templating, would be straightforward, and might be useful.

In view of Jeni's discussion with the TAG the original issue may lead to a resolution soon. @gkellogg @danbri do you think it is worth extracting the script-in-HTML approach and documenting it as a separate issue flagged for a next release? I think it is a valuable idea that we do not want to lose, even if it may be too late to include in the current version.

gkellogg commented 9 years ago

So, I'm unclear on the mechanics here. It seems we would need to register a .well-known location with a new file format, say /.well-known/csvw. For this we would define a file format, which would seem to be a set of link-relations where the URL component is a URI template using the "path" variable. A CSVW processor, when attempting to locate metadata files, would find entries in this file of rel "describedBy" and type "application/csvw+json" and look for files related to a particular csv path by substituting that for the "path" in the URI template.

That suggests that either there are other relations we might discover, other types which might be returned, or that this is a much more generally useful mechanism, and shouldn't be csvw specific.

Doing something csvw specific would seem to be too narrow a case for something like this, but going through a separate RFC process might be too involved.

mnot commented 9 years ago

application/link-format might be helpful: http://tools.ietf.org/html/rfc6690

6a6d74 commented 9 years ago

@gkellogg - I think your summary of the solution is accurate.

For example, /.well-known/csvw might specify:

<{path}-metadata.json>; rel="describedBy"; type="application/csvm+json"
<{path}/../metadata.json>; rel="describedBy"; type="application/csvm+json"

we would define a file format

I think the proposal is to use the format of the value of the Link HTTP header.

If the /.well-known/csvw file is not present, then we have some default behaviour.
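
A rough Python sketch of how a processor might apply such a file follows. Several things here are assumptions for illustration only: `{path}` is expanded by plain string substitution with the URL path of the CSV file (not full URI Template processing), the default content used when the file is absent is simply the two example lines above, and `candidate_metadata_urls` is a hypothetical name. No network access is performed; the caller passes in the fetched text, or `None` if the well-known file was not found.

```python
import re
from urllib.parse import urljoin, urlsplit

# Assumed default when /.well-known/csvw is absent: the two example entries.
DEFAULT_WELL_KNOWN = (
    '<{path}-metadata.json>; rel="describedBy"; type="application/csvm+json"\n'
    '<{path}/../metadata.json>; rel="describedBy"; type="application/csvm+json"\n'
)

_ENTRY = re.compile(r'\s*<([^>]*)>(.*)')
_DESCRIBED_BY = re.compile(r'rel\s*=\s*"?describedby"?', re.IGNORECASE)


def candidate_metadata_urls(csv_url: str, well_known_text=None) -> list:
    """Expand each describedBy template for the given CSV URL, in file order."""
    text = DEFAULT_WELL_KNOWN if well_known_text is None else well_known_text
    csv_path = urlsplit(csv_url).path
    urls = []
    for line in text.splitlines():
        entry = _ENTRY.match(line)
        if not entry or not _DESCRIBED_BY.search(entry.group(2)):
            continue  # skip blank lines and other link relations
        expanded = entry.group(1).replace("{path}", csv_path)
        # Resolving against the CSV URL also removes the "../" segment.
        urls.append(urljoin(csv_url, expanded))
    return urls
```

With the default content, a CSV at `/etc/doap.csv` yields the same two locations as the hard-wired approach, so sites without a well-known file would see no change in behaviour under these assumptions.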

So presumably this means we will need to register with the IANA well-known URI registry?

Looking at the discussion thread, I can see how this would be generally applicable. But for the sake of expedience I am +1 to @iherman's comment noting concerns about delays. Let's just focus on a CSVW solution and retrofit later if need be.

gkellogg commented 9 years ago

Link to TAG minutes: https://github.com/w3ctag/meetings/blob/gh-pages/2015/telcons/06-03-csv-minutes.md

iherman commented 9 years ago

Link to a new discussion thread of the TAG, initiated by David Booth: http://www.w3.org/mid/558318BB.8020701@dbooth.org

iherman commented 9 years ago

Regardless of the outcome of the new round of discussion, this comment may be considered as separate:

The standard directory metadata filename is currently specified in section 5.4 as "metadata.json" http://w3c.github.io/csvw/syntax/#standard-directory-metadata This is a pretty generic name that someone may want to use for other purposes. To reduce the potential for a collision, how about changing this to something that is more CSVW specific, such as "csv-metadata.json"?

A google search for "metadata.json" shows 55,800 hits, while a search for "csv-metadata.json" shows only 22 hits, and all but maybe one of those pertain to this CSVW work.

See http://www.w3.org/mid/5584A341.8020206@dbooth.org

dbooth-boston commented 9 years ago

BTW, I think I have pretty well shown that .well-known is not needed (to avoid harmful URI squatting). But even though it is not needed, some still may want to use it in order to specify non-standard filenames for their metadata. (Of course, using non-standard filenames may create chaos and headache for whoever comes later to maintain your data, but I suppose that's their problem.) My question is: does anyone have any estimate of how many people would actually use .well-known to specify non-standard filenames for their metadata? Any guesses?

iherman commented 9 years ago

My belief is that it is not a matter of how many people in terms of end-users, but of how many sites publish a larger amount of CSV files. (E.g., a site that publishes scientific papers and wants to provide a place where authors can upload data files as well as metadata files.) I think that the .well-known approach leaves system people of such sites the flexibility to decide where the metadata files are with respect to the data files themselves, e.g., if they want to put the data files on a separate machine internally (and they would redirect or proxy the CSV URL-s).

It is this flexibility that makes .well-known attractive to me, regardless of whether the lack of it is architecturally problematic or not (on which, I admit, I do not have a strong opinion).

iherman commented 9 years ago

Looking back a bit at the history of this thread: it is clear that opinions are split. On the other hand, we MUST close this issue because we have to move on (and this is probably the most serious open issue, although the editorial work and the corresponding pull requests are ready to go). I must admit I am anxious to publish an LCCR as soon as possible.

The difficulties arising, and raised by @dbooth-boston, are really implementation related. Hence, I would propose the following:

We issue the LCCR with the .well-known mechanism in place, with all bells and whistles (test cases, including installing a default version on the W3C site)
We call out explicitly in the document that this is a particular issue we want implementors' feedback on, describing the alternative (i.e., the two hard-wired URL-s)

Formally, we declare this feature "at risk", as allowed in the process document, meaning that if the implementation experience demands us to remove the feature, we can do so without jeopardizing the advancement to Proposed Recommendation. We could then close this issue for the LCCR transition (and open a separate one for the PR call as a yes/no answer request from the CR feedback).

@dbooth-boston, is that acceptable for you? @JeniT, @gkellogg, would that be o.k.?

Thanks

6a6d74 commented 9 years ago

Ivan- your proposal seems sensible to me. Thanks, Jeremy

gkellogg commented 9 years ago

+1 I'll add an at-risk note, and the missing text about default caching of negative (404) responses to the PR.

dbooth-boston commented 9 years ago

@iherman I think that's a good idea, but it needs to be phrased the other way around: If implementation experience does _not_ demonstrate significant use of this feature, then it will be dropped in advancement to PR.

iherman commented 9 years ago

@iherman I think that's a good idea, but it needs to be phrased the other way around: If implementation experience does not demonstrate significant use of this feature, then it will be dropped in advancement to PR.

@dbooth-boston, that is unrealistic and slightly biased. The CR phase does not measure deployment, it measures implementations, primarily implementation feasibility. There is no way significant use of a specific feature (or lack thereof) can be measured at that time.

dbooth-boston commented 9 years ago

With all due respect, I think it is the other way around: it would be biased to phrase the "at risk" statement the way it was originally proposed! The .well-known feature was added only as an afterthought, specifically to address a web architectural concern about harmful URI squatting. That concern has now been disproved. Therefore, this feature _does not belong in the spec_ unless there is some compelling new evidence to justify its inclusion.

I would be willing to add this feature _if_ someone can make a compelling argument that there really will be a significant number of people/sites who: (a) do _not_ want to publicize their CSV metadata URLs; (b) want to give their metadata non-standard filenames; and (c) would actually use the .well-known feature to do so. So far, I have not heard anyone willing to make that argument or provide any evidence to support it.

We should not be adding such an obscure feature just because someone thinks there might be sites that would want it. There is nothing in the Use Cases and Requirements document that requires this feature, and the whole purpose of the Use Cases and Requirements document is to distinguish possibly desired features from actually needed features. If someone wants this feature added, I think the burden falls on them to make a compelling argument that it really would have significant usage.

gkellogg commented 9 years ago

We asked the TAG for advice on this issue and received it. @dbooth-boston asked for them to re-consider, which did not result in a change of recommendation. Indeed, the facility seems to be quite useful, but adds an additional request.

Moreover, the use of .well-known, and the lookups it requires are now part of the architecture of the web and a best practice, so I don't see any reason to avoid such specific advice in this case. Accordingly, I added the following at-risk statement:

The use of a well-known location for defining URI patterns used to locate metadata files is at risk. The working group solicits feedback on if this mechanism is useful.

I did suggest in the PR that this may be further word-smithed, but I believe that this facility represents the result of this process and does not place an undue burden on implementations or hosting providers and is in line with likely greater use of .well-known patterns going forward. The at-risk allows us to remove it if enough feedback comes in to show that harm could be done by including it, but at this point, for my part, I think it belongs in the spec.

To say that it was an after-thought could be applied to all advice received during the review period, including the TAG and I18N groups, so I don't think it's fair to say that it's any less well considered because of that.

The purpose of the TAG (as I understand it) is to take up such over-arching web architecture concerns and issue findings that have no official weight, but should be given due consideration, to the point that a very good reason to not follow their advice must be taken. Their advice probably doesn't rise to the level of a finding, but I believe it's reasonable and personally support keeping it in, but am willing to reconsider after due feedback.

I'm also not aware of a mechanism where a group can reserve the right to add a feature using an at-risk; I believe it's intended to indicate that this is a feature that may be removed.

We'll certainly discuss this further before LCCR.

dbooth-boston commented 9 years ago

We asked the TAG for advice on this issue and received it. @dbooth-boston asked for them to re-consider, which did not result in a change of recommendation.

Indeed, the TAG has not yet discussed the issue since I pointed out that the CSV standard metadata URI mechanisms do not in fact cause harmful URI squatting as the TAG had presumed when it previously discussed the issue, so it could not have produced a change in recommendation.

Indeed, the facility seems to be quite useful, but adds an additional request.

Please indicate what sites you think would actually use it if the feature were included.

Moreover, the use of .well-known, and the lookups it requires are now part of the architecture of the web and a best practice, so I don't see any reason to avoid such specific advice in this case.

The use of .well-known is a best practice _to avoid harmful URI squatting_. It emphatically is _not_ a best practice where harmful URI squatting would not result and other mechanisms that are available to a broader spectrum of data publishers can be used.

To say that it was an after-thought could be applied to all advice received during the review period, including the TAG and I18N groups, so I don't think it's fair to say that it's any less well considered because of that.

The point is that it was added specifically to address a concern that has since been shown to have been based on an erroneous assumption. The assumption at the time was that the use of standard URIs for metadata would cause harmful URI squatting. But a more detailed analysis has revealed that that is not the case. In other words, the rationale for adding that feature has evaporated. If you want to replace it with a new rationale you may, but the burden is then on you to demonstrate that this feature really would benefit a significant number of users. And I have not yet seen any evidence of that.

The purpose of the TAG (as I understand it) is to take up such over-arching web architecture concerns and issue findings that have no official weight, but should be given due consideration, to the point that a very good reason to not follow their advice must be taken.

Agreed. And a very good reason to not follow their advice would be if that advice were given in error. And that is precisely what happened. The assumption that harmful URI squatting would result if .well-known were not used was _incorrect_.

I'm also not aware of a mechanism where a group can reserve the right to add a feature using an at-risk; I believe it's intended to indicate that this is a feature that may be removed.

It would be okay with me to include the .well-known feature in LCCR along with an "at risk" note saying that the feature is at risk and will be removed unless there is evidence that it would have significant usage.

iherman commented 9 years ago

@gkellogg I have added a minor comment to the latest commit for a slightly modified text in the 'at risk' note:

The Working Group solicits feedback on whether this mechanism is useful and whether it represents major implementation difficulties.

iherman commented 9 years ago

@dbooth-boston I do not have much to add to Gregg's comment; I agree with it. The TAG has given its advice and I have not seen any instruction/reaction from the TAG on going back on their decision. Consequently, we cannot simply remove the feature.

Adding an "at-risk" notice is the only thing we can and, in my view, should do. I have proposed a slight amendment on the text in @gkellogg's PR; I am actually opposed to the text you propose ("unless there is evidence that it would have significant usage") for the reasons I have already stated: it is an unrealistic and unfulfillable request at a CR phase.

We will, I presume, have a telco tomorrow (Wednesday). The WG should vote and make a decision on the way forward; at this point, we should not keep the LCCR suspended on this issue any more.

dbooth-boston commented 9 years ago

The Working Group solicits feedback on whether this mechanism is useful and whether it represents major implementation difficulties.

Please record me as STRONGLY OPPOSED to the above proposed wording.

gkellogg commented 9 years ago

Reopening for discussion, although PR is merged.

iherman commented 9 years ago

Discussed at: http://www.w3.org/2015/07/01-csvw-irc#T14-25-40

dbooth-boston commented 9 years ago

Poll on the use of .well-known to specify non-standard CSV metadata filenames: https://lists.w3.org/Archives/Public/public-csv-wg/2015Jun/0085.html

dbooth-boston commented 9 years ago

Bad news: this use of .well-known is even worse than I previously realized. https://lists.w3.org/Archives/Public/www-tag/2015Jul/0007.html

gkellogg commented 9 years ago

Reopening, as this is an at-risk issue referenced from the data model document.

iherman commented 9 years ago

Closing this issue; it is now documented as the at risk feature in a separate issue: #691
