Closed beaufortfrancois closed 5 years ago
toJSON rethrows exceptions. It will run a UTF-8 decode on the bytes and then call JSON.parse.
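That description can be sketched in a few lines (the standalone `toJSON` helper below is just for illustration; it assumes the payload is available as a byte buffer):

```javascript
// Sketch of what toJSON() amounts to per the comment above:
// UTF-8-decode the payload bytes, then JSON.parse the result.
// Errors from either step propagate to the caller (i.e. are rethrown).
function toJSON(bytes) {
  return JSON.parse(new TextDecoder().decode(bytes));
}

const bytes = new TextEncoder().encode('{"key1":"value1"}');
console.log(toJSON(bytes).key1); // "value1"
```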
Btw, the MIME type of JSON is "application/json", not "json" or "text/json".
@kenchris I've edited my post with your feedback. Thank you!
It is not really doing more, though it is not doing it via JS calls. It is the same as saying that toText() is just calling toArrayBuffer() and then using a TextDecoder... (though it does a bit more when it's a text record).
There's some precedent here with push bodies.
If you're going to add this, the naming should align with push (and fetch). This is especially important as `toJSON` already has meaning on the platform.

The benefit of `response.json()` is that the platform can parse it in a streaming manner. That isn't really true here or in push, since the amount of data is really small and synchronously available.

If it's common to get the data as text, then a helper method seems reasonable (and that should be named `.text()`). Then I guess you may as well add the rest for consistency, although I don't think I'd be arguing for that if push hadn't set a precedent.
It is already implemented. I think it offers a lot of convenience for normal users. Not everyone understands how JSON and text are encoded or knows how to use a TextDecoder, etc. And with our API it is very easy to write JSON, so it should definitely be just as easy to read it back.
Wait, `toJSON()` got through TAG review?
https://w3c.github.io/web-nfc/#dfn-convert-ndefrecord-payloaddata-bytes seems really inconsistent with the web platform.

Firstly with naming. Other APIs such as fetch and push use `text()` and `json()` rather than `toText()` and `toJSON()`. This was done deliberately, since `toJSON()` is kind of like `toString()` in that it's used as a way to 'cast' an object to a particular type. This means that if an `NDEFRecord` object is processed as part of `JSON.stringify()`, it'll call the `toJSON()` method. This is weird, since `NDEFRecord`'s `toJSON()` method doesn't return a JSON-compatible representation of the record itself, which is the intent of the `toJSON()` method elsewhere on the platform.
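The 'cast' behavior can be seen with a plain object standing in for NDEFRecord (a hypothetical shape, just to illustrate the hook):

```javascript
// A stand-in object with a toJSON() that returns its parsed payload,
// the way a payload-parsing toJSON() would, rather than a
// JSON-compatible representation of the record itself.
const record = {
  recordType: "json",
  toJSON() {
    return { key1: "value1" }; // the payload, not the record
  }
};

// JSON.stringify() calls toJSON() implicitly, so the record's own
// fields (recordType, etc.) silently vanish from the output.
console.log(JSON.stringify({ record })); // {"record":{"key1":"value1"}}
```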
Additionally, the conversion reacts to the `recordType`. If I ask for a record to be parsed as JSON, but the type isn't 'json' or 'opaque', it returns null. This is weird, because I've asked for the record to be read as JSON. If it can't be read as JSON, it should throw.
If the developer has indicated something should be interpreted as a type, we don't generally 'succeed' if the other end has declared itself to be a different type. E.g., if I use an `<img>` to load something that's not an image, it'll fire the error event, even if its content-type (`text/javascript`, for instance) indicates it isn't an image. In some cases we fail if the content declares itself to be another type (e.g., we check the content-type for module scripts and CSS), but we will still error.
In fetch and push, if you call `.json()`, the content will be parsed as JSON regardless of the content type of the response. If the content cannot be parsed as JSON, it throws/rejects.
It was originally called `json()` etc., but @marcoscaceres got us to change it to `toJSON` as it is not async.

I am fine with throwing instead of returning null.
Weird... PushEvent.data.json() is not async either.
Wait, which color is the bikeshed today? 😂 but seriously, just pick one. I like Jake’s rationale, but I don’t mind what it’s called and definitely don’t want to debate about it.
Ok, settled then... want to do the PR, @beaufortfrancois?
Sure!
If I understood correctly, we want to rename `toJSON()`, `toText()` and `toArrayBuffer()` to `json()`, `text()` and `arrayBuffer()` respectively.
Now, if we're adding `json()` just for the sake of consistency with Push, I'd argue we should add `blob()` too; but since that doesn't make sense, I'd vote for removing `json()` and simply keeping `text()` and `arrayBuffer()`.
If people complain that `JSON.parse()` is too hard during the upcoming Origin Trial, how about adding it back later?
People write JSON directly, so it does make sense to have a `.json()` method instead of having to call `JSON.parse()` manually. You could argue that if we have `JSON.parse` then we don't need to accept JSON and everyone should just call `JSON.stringify`, but then people will end up doing that with the text record instead of using a proper opaque record.

There is so little cost in adding `.json()`: we already have it implemented, it offers convenience, and it makes writing and reading more consistent.
You could argue that if we have JSON.parse then we dont need to accept json and everyone should just call JSON.stringify
This is how fetch works.
I'm not sure why removing `NDEFRecord.json()` will empower web developers to use `JSON.stringify`. They already have 3 ways to push JSON objects today. Unless we provide helpful documentation on why this is bad practice (encoding added, bigger size), I think people are going to use the easy way (#1), regardless of the `json()` method.
```js
// #1
writer.push(JSON.stringify({ key1: "value1" }));

// #2
writer.push({
  records: [
    {
      recordType: "text",
      data: JSON.stringify({ key1: "value1" })
    }
  ]
});

// #3
writer.push({
  records: [
    {
      recordType: "json",
      mediaType: "application/json",
      data: { key1: "value1" }
    }
  ]
});
```
FWIW the "json" part is artificial, created by us in Web NFC. Otherwise both `"opaque"` and `"json"` map to Media type records that have a MIME type, and we could just merge them into "media" (replacing "json" and "opaque"). The simplicity of this is tempting.
I am asking myself, given that we can provide examples of good patterns, whether the "json" use case is so relevant that we want to keep it as a separate thing (for people coming from the NFC side, it would be easier if we simply handled it as a media type).

Maybe "json" and "opaque" add enough semantics to be useful, but I don't see people having any problem invoking `JSON.parse()` and `JSON.stringify()` if examples/patterns were given.
The more I think about it, the more I'd like us to get rid of JSON, and even other special casing. I feel like Web NFC is trying to be both an API and a library at the same time. So here's my proposal.
I believe great documentation will make this API shine.
Edited: "language" and "encoding" have been added to NDEFRecord. Edited: "language" has been changed to "lang".
```webidl
[Exposed=Window]
interface NDEFRecord {
  constructor(NDEFRecordInit recordInit);
  readonly attribute NDEFRecordType recordType;
  readonly attribute USVString mediaType;
  readonly attribute USVString id;
+ readonly attribute DataView? data;
+ readonly attribute USVString lang;
+ readonly attribute USVString encoding;
- USVString? text();
- [NewObject] ArrayBuffer? arrayBuffer();
- [NewObject] any json();
  sequence<NDEFRecord> toRecords();
};

dictionary NDEFRecordInit {
  NDEFRecordType recordType;
  USVString mediaType;
  USVString id;
+ USVString language; // defaults to `document.documentElement.lang` or "en-US" if unset
+ USVString encoding = "utf-8"; // MUST be "utf-8" or "utf-16"
  any data;
};
```
`TextDecoder` for decoding text tags is really easy to use and can be reused. Moreover, it allows web developers to specify the encoding (UTF-16, for instance), which, once again, gives control back to web developers. This allows us not to try to make the API smart about it. Edited: TextEncoder is now used for "application/json".
```js
async function writeToNfcTag() {
  const encoder = new TextEncoder();
  const writer = new NFCWriter();
  const records = [
    {
      recordType: "text",
      encoding: "utf-16",
      data: Uint16Array.of(98, 111, 110, 106, 111, 117, 114), // "bonjour"
      lang: "fr"
    },
    {
      recordType: "url",
      data: "https://google.com"
    },
    {
      recordType: "media",
      mediaType: "application/json",
      data: encoder.encode(JSON.stringify({ key1: "value1", key2: "value2" }))
    },
    {
      recordType: "media",
      mediaType: "image/png",
      data: await (await fetch("image.png")).arrayBuffer()
    },
    {
      recordType: "android.com:pkg", // Known external type
      data: encoder.encode("com.example")
    },
    {
      recordType: "example.com:a", // Custom external type
      data: Uint8Array.of(1)
    }
  ];
  return writer.push({ records });
}
```
```js
function readNfcTag() {
  const reader = new NFCReader();
  reader.scan();
  reader.onreading = ({ message }) => {
    const decoder = new TextDecoder(); // UTF-8 by default
    for (const record of message.records) {
      const data = record.data; // NEW!
      switch (record.recordType) {
        case "text":
          const textDecoder = new TextDecoder(record.encoding);
          console.log(`Text: ${textDecoder.decode(data)} - Language: ${record.lang}`);
          break;
        case "url":
          console.log(`URL: ${decoder.decode(data)}`);
          break;
        case "media":
          // Let the developer handle the media case
          if (record.mediaType === "application/json") {
            console.log(`JSON: ${JSON.parse(decoder.decode(data))}`);
          } else if (record.mediaType.startsWith("image/")) {
            const blob = new Blob([data], { type: record.mediaType });
            const img = document.createElement("img");
            img.src = URL.createObjectURL(blob);
            document.body.appendChild(img);
          } else {
            console.log(`Media not handled`);
          }
          break;
        case "android.com:pkg":
          console.log(`AAR Package Name: ${decoder.decode(data)}`);
          break;
        case "example.com:a":
          console.log(`My custom external type: ${data.getUint8(0)}`);
          break;
      }
    }
  };
}
```
@beaufortfrancois Keep in mind that text is handled specially in NFC, as it supports UTF-8 and UTF-16, so it is not as easy as just using a text decoder:
```js
const decoder = new TextDecoder(); // UTF-8 by default
```

It CANNOT decode UTF-16, which may exist in NDEF tags. We actually don't allow writing UTF-16, but at least we allow reading it, due to existing tags. We could of course specify that we return an array buffer, but then it's kind of lying.
Another suggestion: should we use `data` or go with `payload`, which is closer to the NDEF specs?
Also how should we handle embedded records?
```js
data: new NDEFEncoder().encode([{
  recordType: "local:a", // Local type
  data: Uint8Array.of(1)
}])
```

```webidl
interface NDEFDecoder {
  sequence<NDEFRecord> decode(ArrayBuffer buffer);
};

interface NDEFEncoder {
  [NewObject] ArrayBuffer encode(sequence<NDEFRecord> records);
};
```
Text is a problem, as it actually encodes language and encoding (UTF-8 or -16). Decoding this manually is actually quite hard:
```js
// NDEF Text record status byte: bit 7 is the UTF-16 flag
// (0 = UTF-8, 1 = UTF-16), bits 5..0 are the language code length.
const header = data.getUint8(0);
const langCodeLength = header & 0x3f;
const utf16Encoded = header & 0x80;
const lang = new TextDecoder().decode(data.buffer.slice(1, 1 + langCodeLength));
const text = new TextDecoder(utf16Encoded ? "utf-16be" : "utf-8")
  .decode(data.buffer.slice(1 + langCodeLength));
```
On the web platform we cannot encode UTF-16 but we can easily decode it.
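A worked example of that manual decode, using a hand-built Text record payload (the bytes are made up for illustration): the status byte's bit 7 is the UTF-16 flag and bits 5..0 hold the language-code length.

```javascript
// NDEF Text record payload for "hola" with language code "es".
// Status byte 0x02: bit 7 = 0 (UTF-8), language code length = 2.
const payload = Uint8Array.of(
  0x02,                    // status byte
  0x65, 0x73,              // "es"
  0x68, 0x6f, 0x6c, 0x61   // "hola"
);

const header = payload[0];
const langCodeLength = header & 0x3f;
const utf16 = header & 0x80;

const lang = new TextDecoder().decode(payload.subarray(1, 1 + langCodeLength));
const text = new TextDecoder(utf16 ? "utf-16be" : "utf-8")
  .decode(payload.subarray(1 + langCodeLength));

console.log(lang, text); // es hola
```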
Merge the "json" and "opaque" recordTypes into a new "media" recordType. They are actually both MIME type records.
If you feel like `json` doesn't add much value, we can remove it. I can do that in the current open PR, i.e. now :)

We can name MIME type records `opaque` or `mime`.
I don't have a strong opinion on `data` vs. `payload`, or on whether we use getters or a DataView. I like the current getters, but if there is a better alternative, I am open to that.
Thanks for all the feedback.
Also how should we handle embedded records?
I think we should keep it simple and use JavaScript objects only for embedded records. In other words:

Edited: Add missing "JSON.stringify"
```js
const encoder = new TextEncoder();
const jsonString = JSON.stringify({ key1: "value1", key2: "value2" });
const records = [
  {
    recordType: "media",
    mediaType: "application/json",
    data: encoder.encode(jsonString)
  }
];
```
The above doesn't seem right... JSON isn't text.
We can name MIME type records `opaque` or `mime`.
If `media` doesn't fit, I'd prefer using `mime`, as it is familiar to NFC developers (MIME media type) and applies directly to a record type.
I don't have a strong opinion on `data` or `payload` and whether we use getters or DataView. I like the current getters, but if there is a better alternative, I am open to that.
I'm not a huge fan of `payload`, though I don't feel strongly about it. Note that it is not common in the web platform (https://cs.chromium.org/search/?q=payload;+f:idl$+f:blink&sq=package:chromium&type=cs), unlike `data` (https://cs.chromium.org/search/?q=data;+f:idl$+f:blink&sq=package:chromium&type=cs).
I was suggesting `payload` because that is what NDEF calls it, but I guess `data` is fine if it is consistent.

I actually like `media`, but I still think it is worth filing an issue on the TAG design principles so we know what is preferred. We can go with `media` for now.
We need to find out how to handle the special text record
```js
const records = [
  {
    recordType: "text",
    lang: "en-US",
    encoding: "utf-8",
    data: encoder.encode("a text string")
  }
];
```
We should throw when the encoding is different from utf-8 on other text-based records. Also throw on the wrong kind of utf-16 (as there are two :) utf-16be and utf-16le).
```js
case "text":
  const decoder = new TextDecoder(record.encoding);
  console.log(`Text: ${decoder.decode(record.data)}, language ${record.lang}`);
  break;
```
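One reading of that throwing rule could look like the following sketch (the `validateEncoding` helper and its error messages are hypothetical, not part of the spec):

```javascript
// Validate the `encoding` member of a record init, per the rule
// above: non-"text" records must be UTF-8, and a bare "utf-16"
// is ambiguous between the two byte orders (utf-16be / utf-16le).
function validateEncoding(recordType, encoding = "utf-8") {
  if (recordType !== "text" && encoding !== "utf-8") {
    throw new TypeError(`"${encoding}" is only valid on "text" records`);
  }
  if (!["utf-8", "utf-16be", "utf-16le"].includes(encoding)) {
    throw new TypeError(`unsupported encoding "${encoding}"`);
  }
  return encoding;
}

validateEncoding("text", "utf-16be");   // ok
// validateEncoding("url", "utf-16be"); // would throw TypeError
```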
We need to find out how to handle the special text record
`const records = [{ recordType: "text", lang: "en-US", encoding: "utf-8", data: encoder.encode("a text string") }];`
How about making `lang` optional (defaulting to the user agent's default locale) and `encoding` optional (defaulting to utf-8), and keeping `data` a string for the "text" recordType?

Note that this improvement is orthogonal to this issue, but it is still needed for the "text" recordType, as web developers need to be able to set `lang` and `encoding`.
```js
// Both would be equivalent for an English user.
const records = [
  {
    recordType: "text",
    lang: "en-US",
    encoding: "utf-8",
    data: "a text string",
  }
];

const records = [
  {
    recordType: "text",
    data: "a text string",
  }
];
```
For info, `createTextRecord` on Android uses UTF-8, while Core NFC (iOS) seems to use UTF-16 encoding.
We should throw when encoding is different from utf-8 on other text based records. Also throw on the wrong kind of utf-16 (as there are two :))
case "text": console.log(`Text: ${new TextDecoder(record.encoding).decode(record.data)}, language ${record.lang}`); break;
Exposing those seems good to me. Thank you!
I actually like `media`, but I still think it is worth filing an issue on the TAG design principles so we know what is preferred.
I've filed one at https://github.com/w3ctag/design-principles/issues/140
How about making `lang` optional (default to user agent default locale) and `encoding` optional (default to utf-8) and keeping `data` a string for the "text" recordType?

These are supposed to be optional. I am not sure what other APIs do about the default... the browser's default locale? Could you check? Also, maybe add these examples to your API change suggestion above.
I've filed one at w3ctag/design-principles#140
Thanks!
I'd prefer `media` too, but I was instructed we should not use it. Thanks for filing that issue.
How about making optional lang (default to user agent default locale) and encoding (default to utf-8) and keeping data a string for "text" recordType?
I have been thinking about the same. In the NDEF record, the encoding and language code are in the first bytes of the PAYLOAD. So we can represent the PAYLOAD split between `lang`, `encoding` and `data`. In that sense, I would prefer using `data`, since we will expose `lang` and `encoding` separately.
A timid question: wouldn't it be clearer if we had separate definitions for different record types? Like in the example below (to be improved, but you get the idea).
```webidl
[Exposed=Window]
interface NDEFRecord {
  constructor(NDEFRecordInit recordInit);
  readonly attribute NDEFRecordType recordType;
  readonly attribute USVString mediaType;
  readonly attribute USVString id;
  readonly attribute DataView? data;
};

dictionary NDEFRecordInit {
  required NDEFRecordType recordType;
  USVString mediaType;
  USVString id;
  any data;
};

[Exposed=Window]
interface TextRecord : NDEFRecord {
  constructor(TextRecordInit recordInit);
  readonly attribute DOMString lang;
  readonly attribute DOMString encoding;
  USVString? text(); // for convenience
};

dictionary TextRecordInit : NDEFRecordInit {
  NDEFRecordType recordType = "text";
  DOMString lang;
  DOMString encoding;
};

[Exposed=Window]
interface SmartPosterRecord : NDEFRecord {
  constructor(SmartPosterRecordInit init);
  sequence<NDEFRecord> toRecords(); // this could also be an attribute
};

dictionary SmartPosterRecordInit : NDEFRecordInit {
  NDEFRecordType recordType = "smart-poster";
  sequence<NDEFRecord> records;
};

[Exposed=Window]
interface ExternalTypeRecord : NDEFRecord {
  constructor(ExternalTypeRecordInit init);
  sequence<NDEFRecord> toRecords(); // this should be a getter
  [NewObject] ArrayBuffer? arrayBuffer();
};

dictionary ExternalTypeRecordInit : NDEFRecordInit {
  NDEFRecordType recordType = "external";
  sequence<NDEFRecord> records; // nesting allowed; optional
};

// and so on for other types
```
OK, it's a lot to spec and implement.
It is better in terms of spec work, but I think it makes it harder to read. I don't think it makes a big difference for the developer, and it is not common for people to get different types back and have to do type checks (unless you know what is available, but then we are back to what we have today).
How about making `lang` optional (default to user agent default locale) and `encoding` optional (default to utf-8) and keeping `data` a string for the "text" recordType?

These are supposed to be optional. I am not sure what other APIs do about the default... the browser's default locale? Could you check? Also, maybe add these examples to your API change suggestion above.
I didn't find any locale/language/lang default parameter. We may want to not add an IDL default but simply spec that "language" will be the browser user agent locale if not specified. WDYT?
I've updated my API change suggestion above.
I found:
If the attribute value is the empty string (lang=""), the language is set to unknown; if the language tag is not valid according to BCP47, it is set to invalid.
I believe most NDEF APIs defaults to en-US but will have to check. Ah actually this is what Android does:
encoding: "utf-16",
utf-16 is not explicit enough; it has to be either utf-16be or utf-16le. I'll have to check what NFC supports.
The encoding spec seems to categorize utf-16 as utf-16be
Actually the other way around (thanks to missing CSS styles)
It actually depends on whether NFC stores the byte order mark (BOM) or not
I found:
If the attribute value is the empty string (lang=""), the language is set to unknown; if the language tag is not valid according to BCP47, it is set to invalid.
I believe most NDEF APIs defaults to en-US but will have to check. Ah actually this is what Android does:
That's what I meant. There are no IDL defaults (e.g. `lang = "en-US"`). What do you recommend?
Actually the other way around (thanks to missing CSS styles)
It actually depends on whether NFC stores the byte order mark (BOM) or not
Why not have "utf-16" as a valid encoding for NDEFRecordInit and let the implementation figure out the appropriate one? When reading back, though, record.encoding would be "utf-16be" or "utf-16le" and match TextDecoder. WDYT?
I don't much like using the browser language, because it will surprise developers: they will most likely just test with their own locale. So en-US seems like a valid default, unless people declaratively (in HTML, etc.) set the language.
Document language:

```html
<html lang="en">
...
</html>
```

`document.documentElement.lang`

This differs from the UA (browser) language:

`navigator.language`
Let's pick `document.documentElement.lang`. I've updated the API changes suggestion.
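That fallback chain can be sketched as follows (the `defaultLang` helper is hypothetical, just illustrating the resolution order):

```javascript
// Resolve the default language for a "text" record: prefer the
// document's declared language, fall back to "en-US" when it is
// unset or empty.
function defaultLang() {
  const docLang =
    typeof document !== "undefined" ? document.documentElement.lang : "";
  return docLang || "en-US";
}

console.log(defaultLang()); // "en-US" outside a browser document
```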
Regarding UTF-16 in NDEF:

The Unicode BOM (byte order mark) MUST be tolerated (i.e. no error conditions).

That is what TextDecoder does, but maybe we need to detect it then: if it has a BOM, it's "utf-16".

When writing a text record the BOM MAY be omitted; in that case the byte order shall be big endian ("utf-16be").

Basically, we allow people to write "utf-16" and "utf-16be", but they are on their own as to how they do that, as TextEncoder won't help with writing :-) (libraries do, though). When reading and detecting UTF-16, we probably need to check whether it has a BOM or not.
Or maybe it is fine to just say "utf-16" in that case, as you can decode with `new TextDecoder("utf-16be", { ignoreBOM: false })` and it should just work. Good example, though.
It is better in terms of spec work but it makes it harder to read I think.
I don't think it makes a big difference for the developer, and it is not common for people to get different types back and having to do type check
Right now we need to expose a union of properties and getters across all record types. That is confusing as well: for instance, Generic Sensors didn't dump all possible properties into a single object, for a reason.

Even now scripts need to check `recordType`; there is no change there. But we'd get self-documenting clarity on which getters are supposed to work, etc., and record creation would become better defined as well. Anyway, it would be a big change.
@zolkis these are very few properties we are talking about; just expose them everywhere and make them null when they are irrelevant.
OK, I think I understand UTF-16 more or less now. NFC only writes in big endian. When you write the BOM, you always write 0xFEFF, and that gets swapped if the decoder and encoder have different endianness.
So this works fine on my little endian machine:
I guess that means we should always use "utf-16be" and write that as the encoding when we detect UTF-16.
Trying to explain endianness for the spec:

The byte order of NDEF is big endian, which means that everything is read back as big endian. For UTF-16, byte order matters, as it might differ between reading and writing. For this reason UTF-16 usually has a byte order mark (BOM), 0xFEFF; if the byte order differs between the host machine and NDEF, the value will be read back as 0xFFFE, indicating that the byte order should be swapped. In case no BOM is present, UTF-16BE (big endian) encoding should be assumed.
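TextDecoder's BOM handling can be sketched as follows (utf-16le is used here so the snippet runs everywhere, even though NDEF itself is big endian):

```javascript
// "hi" in UTF-16LE, preceded by the BOM bytes 0xFF 0xFE.
const bytes = Uint8Array.of(0xff, 0xfe, 0x68, 0x00, 0x69, 0x00);

// With the default { ignoreBOM: false }, the decoder consumes a
// matching BOM instead of emitting it as a character.
const stripped = new TextDecoder("utf-16le").decode(bytes);
console.log(stripped); // "hi"

// With ignoreBOM: true, the BOM survives as U+FEFF in the output.
const kept = new TextDecoder("utf-16le", { ignoreBOM: true }).decode(bytes);
console.log(kept.charCodeAt(0).toString(16)); // "feff"
```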
Is `NDEFRecord.toJSON()` doing more than `JSON.parse(NDEFRecord.toText())`? I guess it doesn't raise an error when the payload is not a valid JSON message, but I'd like to hear from you what its value is. To a certain extent, `toArrayBuffer()` is all that's needed to get text/JSON as well.