eclipse-archived / smarthome

Eclipse SmartHome™ project
https://www.eclipse.org/smarthome/
Eclipse Public License 2.0
862 stars 787 forks source link

Fix IllegalArgumentException due to wrong micro character in units #6818

Closed wborn closed 5 years ago

wborn commented 5 years ago

Whenever µg/m³ and μW/cm² are used in a QuantityType an IllegalArgumentException occurs when the string values are parsed while creating item events (https://github.com/openhab/openhab2-addons/pull/4507#discussion_r245507046):

2019-01-03 20:45:33.125 [ERROR] [me.core.internal.events.EventHandler] - Creation of ESH-Event failed, because one of the registered event factories has thrown an exception: Error invoking #valueOf(String) on class 'org.eclipse.smarthome.core.library.types.QuantityType' with value '2.60 μg/m³'.
java.lang.IllegalStateException: Error invoking #valueOf(String) on class 'org.eclipse.smarthome.core.library.types.QuantityType' with value '2.60 μg/m³'.
        at org.eclipse.smarthome.core.items.events.ItemEventFactory.parseSimpleClassName(ItemEventFactory.java:190) ~[?:?]
        at org.eclipse.smarthome.core.items.events.ItemEventFactory.parseType(ItemEventFactory.java:158) ~[?:?]
        at org.eclipse.smarthome.core.items.events.ItemEventFactory.getState(ItemEventFactory.java:136) ~[?:?]
        at org.eclipse.smarthome.core.items.events.ItemEventFactory.createStateEvent(ItemEventFactory.java:116) ~[?:?]
        at org.eclipse.smarthome.core.items.events.ItemEventFactory.createEventByType(ItemEventFactory.java:80) ~[?:?]
        at org.eclipse.smarthome.core.events.AbstractEventFactory.createEvent(AbstractEventFactory.java:51) ~[?:?]
        at org.eclipse.smarthome.core.internal.events.EventHandler.createESHEvent(EventHandler.java:121) ~[?:?]
        at org.eclipse.smarthome.core.internal.events.EventHandler.handleEvent(EventHandler.java:95) ~[?:?]
        at org.eclipse.smarthome.core.internal.events.EventHandler.handleEvent(EventHandler.java:72) ~[?:?]
        at org.eclipse.smarthome.core.internal.events.ThreadedEventHandler.lambda$0(ThreadedEventHandler.java:67) ~[?:?]
        at java.lang.Thread.run(Thread.java:748) [?:?]
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:?]
        at org.eclipse.smarthome.core.items.events.ItemEventFactory.parseSimpleClassName(ItemEventFactory.java:181) ~[?:?]
        ... 10 more
Caused by: java.lang.IllegalArgumentException: μg not recognized (in 2.60 μg/m³ at index 5)
        at tec.uom.se.quantity.Quantities.getQuantity(Quantities.java:80) ~[?:?]
        at org.eclipse.smarthome.core.library.types.QuantityType.<init>(QuantityType.java:96) ~[?:?]
        at org.eclipse.smarthome.core.library.types.QuantityType.valueOf(QuantityType.java:139) ~[?:?]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:?]
        at org.eclipse.smarthome.core.items.events.ItemEventFactory.parseSimpleClassName(ItemEventFactory.java:181) ~[?:?]
        ... 10 more

These units got introduced in https://github.com/eclipse/smarthome/pull/6563 and https://github.com/eclipse/smarthome/pull/6575.

The issue is that Unicode defines a "Micro sign" and a "Greek Small Letter Mu" character. In the UoM library the "Micro sign" is used whereas the PRs added units using the "Greek Small Letter Mu" characters.

See also the list of Unicode characters:

Code Glyph Decimal Description #
U+00B5 µ 0181 Micro sign 0117
U+03BC μ 956 Greek Small Letter Mu 0428

See also the Wikipedia page about Duplicate characters in Unicode.

kaikreuzer commented 5 years ago

That's nasty. I would assume that people will use the greek mu when creating rules as this is what you get from the keyboard shortcut (on Mac alt+m), right? So this won't work then anymore, which might be an issue...

wborn commented 5 years ago

I'm not a Mac user myself, so it's a bit hard to test for me. ;-)

Though if the Mac shortcut creates the "Greek Small Letter Mu" you will already have issues with the default UoM micro symbol when using it with grams:

QuantityType.valueOf("1 μg")

Throws:

java.lang.IllegalArgumentException: μg not recognized (in 1 μg at index 2)
    at tec.uom.se.quantity.Quantities.getQuantity(Quantities.java:80)
    at org.eclipse.smarthome.core.library.types.QuantityType.<init>(QuantityType.java:96)
    at org.eclipse.smarthome.core.library.types.QuantityType.valueOf(QuantityType.java:139)
...

Whereas QuantityType.valueOf("1 µg") works.

J-N-K commented 5 years ago

That sounds very irritating for the user. Can‘t we define both as equivalent? For units itself we already have that (e.g. mbar and hPa). Just an idea without looking at the code.

hakan42 commented 5 years ago

At least in my use case (the sensebox binding), the upstream data source sends me the greek small letter. When I came up with the PRs, I fetched a json dump with curl, and copy&pasted the mu sign from there.

Which then further propagated into my channel definitions (for UoM), the code, the README, and so on.

wborn commented 5 years ago

I found a Mac user who pressed "Mac alt+m" and it resulted in the correct "Micro sign". :smile: It doesn't look like you can easily extend the UoM library to use additional/equivalent metric prefixes.

kaikreuzer commented 5 years ago

Ok, thanks. So if I understand @hakan42 right, he will have to convert the character when receiving it from the "upstream data source", but then all processing should be fine when this PR is merged, right?

hakan42 commented 5 years ago

Yes, I would convert the character, and, to be safe, write all hardcoded character strings in their Unicode representation. You never know when someone falls into the same pitfall again :)

Which makes me think: @wborn , could you write the mu sign in the smarthome java code with the proper unicode representation? I just want to be sure that when someone saves the java file in their preferred editor, nothing bad will happen to the mus again.

Also, I should / will create a PR against the smarthome documentation for this issue so the next developer will know that there is a possible pitfall.

wborn commented 5 years ago

could you write the mu sign in the smarthome java code with the proper unicode representation?

All ESH docs and code use the right symbol except for the lines that are fixed in this PR:

smarthome$ grep -nr 'μ' --include={*.md,*.xml,*.java}
bundles/core/org.eclipse.smarthome.core/src/main/java/org/eclipse/smarthome/core/library/unit/SmartHomeUnits.java:175:        SimpleUnitFormat.getInstance().label(MICROWATT_PER_SQUARE_CENTIMETRE, "μW/cm²");
bundles/core/org.eclipse.smarthome.core/src/main/java/org/eclipse/smarthome/core/library/unit/SmartHomeUnits.java:177:        SimpleUnitFormat.getInstance().label(MICROGRAM_PER_CUBICMETRE, "μg/m³");
bundles/core/org.eclipse.smarthome.core.test/src/test/java/org/eclipse/smarthome/core/types/SmartHomeUnitsTest.java:277:        assertThat(SmartHomeUnits.MICROGRAM_PER_CUBICMETRE.toString(), is("μg/m³"));
bundles/core/org.eclipse.smarthome.core.test/src/test/java/org/eclipse/smarthome/core/types/SmartHomeUnitsTest.java:289:        assertThat(SmartHomeUnits.MICROWATT_PER_SQUARE_CENTIMETRE.toString(), is("μW/cm²"));

In openhab2-addons the wrong symbol is used in:

openhab2-addons$ grep -nr 'μ' --include={*.md,*.xml,*.java}
addons/binding/org.openhab.binding.sensebox/ESH-INF/thing/channels.xml:54:      "unit": "μW/cm²",
addons/binding/org.openhab.binding.sensebox/ESH-INF/thing/channels.xml:60:      <description>Current UV intensity in μW/cm²</description>
addons/binding/org.openhab.binding.sensebox/ESH-INF/thing/channels.xml:62:      <state readOnly="true" pattern="%.1f μW/cm²"></state>
addons/binding/org.openhab.binding.sensebox/README.md:77:Number  Zugspitze_UVIntensity      "Zugspitze UvIntensity [%.1f μW/cm²]"  <light>         (Zugspitze, Weather)       { channel="sensebox:box:zugspitze:measurements#uvIntensity" }
addons/binding/org.openhab.binding.airvisualnode/ESH-INF/thing/thing-types.xml:89:      <description>PM2.5 level, μg/m&#179;</description>
addons/binding/org.openhab.binding.airvisualnode/README.md:43:| pm_25           | Number:Density        | PM2.5 level, μg/m³          |

We could also prevent these issues by replacing the "Greek Small Letter Mu" with a "Micro sign" in the methods where a QuantityType is created/converted with such a unit string:

https://github.com/eclipse/smarthome/blob/9de1cb05fc0d221b05c729a5ffb75a31eb54bc82/bundles/core/org.eclipse.smarthome.core/src/main/java/org/eclipse/smarthome/core/library/types/QuantityType.java#L91-L97

https://github.com/eclipse/smarthome/blob/9de1cb05fc0d221b05c729a5ffb75a31eb54bc82/bundles/core/org.eclipse.smarthome.core/src/main/java/org/eclipse/smarthome/core/library/types/QuantityType.java#L210-L217

hakan42 commented 5 years ago

I amended the PR in the openhab2-addons repository:

https://github.com/openhab/openhab2-addons/pull/4507/commits/69eddb2b726d02f4f2ced62c13a54d214a51064b

maggu2810 commented 5 years ago

It doesn't look like you can easily extend the UoM library to use additional/equivalent metric prefixes.

Would be great if it could be supported (some time). We already had a similar problem with °C vs (the latter one is the unicode character for degree celsius, the first is a degree character and a upper C)

kaikreuzer commented 5 years ago

We could also prevent these issues by replacing the "Greek Small Letter Mu" with a "Micro sign" in the methods where a QuantityType is created/converted with such a unit string:

Might be a good safety measure to avoid issues with it in the future. But can be done as a separate PR.