stoduk / PingSerial

Arduino library for using serial enabled ultrasonic distance modules (eg. US-100)
MIT License
17 stars 9 forks source link

Remove leading space from keywords.txt identifier #4

Open per1234 opened 6 years ago

per1234 commented 6 years ago

The Arduino IDE requires the use of a single true tab separator between the keyword name and identifier. When spaces are used rather than a true tab the keyword is not highlighted.

Reference: https://github.com/arduino/Arduino/wiki/Arduino-IDE-1.5:-Library-specification#keywords

stoduk commented 6 years ago

Have you tried raising this with the Arduino team? Strikes me as an obvious robustness bug on their side - "be generous with your inputs and strict with your outputs".

AFAICS there is no ambiguity here - KEYWORD is subject to C language's naming convention and so can contain no whitespace; KEYWORD_TOKENTYPE is a fixed string which cannot contain whitespace. To refuse all but a very specific whitespace is pointless pedantic on their part.

(I'm not blaming you, obviously, just suggesting we fix this in one place rather than in every Arduino library that has been written!)

per1234 commented 6 years ago

Yes, the issue has been raised (https://github.com/arduino/Arduino/pull/7245), but no good solution was settled on and there has not been any recent progress. It's a bit more complicated than you make it out to be because there are multiple fields and it's necessary to support empty fields. How are you going to do that while allowing arbitrary numbers of spaces?

Personally, I don't find simply following the specification so onerous but there are many hundreds of libraries with bad keywords.txt formatting so this is certainly a significant issue. Even after I submit a PR to fix bad formatting, I often find when I come back a few months later that it's been broken again. People seem incapable of resisting their OCD urges to line everything up all pretty.

stoduk commented 6 years ago

keywords.txt syntax is a real hot mess, I wish I could undo reading about that :)

I don't think KEYWORD or either of the two competing TOKENTYPE can contain whitespace. Can REFERENCE_LINK contain whitespace?

if not then a more robust parser would be to trim and then split by all whitespace, then based on the number of items:

I'd probably also change it so that we only ask for a single TOKENTYPE and it can contain either new or old format (they are mutually exclusive, so no ambiguity).

Anyway, if you'd rather I just approve this and forget about it then I'm happy to do that :D

per1234 commented 6 years ago

Can REFERENCE_LINK contain whitespace?

I just did a test, and yes it can. Currently the reference links all point to official Arduino reference pages and Arduino does not use spaces in those filenames. There is an official plan to allow documentation to be bundled with Arduino libraries and it's my hope that this will allow 3rd party library authors to use the REFERENCE_LINK field to links to their own bundled documentation. I don't think it would be a problem to just not support spaces in reference page filenames, as long as that's made clear in the documentation.

if not then a more robust parser would be to trim and then split by all whitespace, then based on the number of items

I think it's a reasonable proposal. It does certainly make the keywords.txt parsing code more complex but considering that around 2000 libraries had incorrect keywords.txt, I think smarter code is worth the extra maintenance burden.

It is currently possible to leave the KEYWORD_TOKENTYPE field empty, and I have seen a couple libraries that do this, but not many. As Arduino IDE 1.6.4 is pretty ancient now I think people will find it kind of ridiculous to need to fill that field when they update to the new four field format since RSYNTAXTEXTAREA_TOKENTYPE overrides it in any newer IDE version but it's worth doing it if only to avoid having another confusing empty field to maintain.

I'd probably also change it so that we only ask for a single TOKENTYPE and it can contain either new or old format (they are mutually exclusive, so no ambiguity).

I really don't like this bolted on RSYNTAXTEXTAREA_TOKENTYPE field. That's just super hacky and they didn't even bother to document it. Even the lead Arduino developer was hazy on what it was for. I ended up having to just figure it out through trial and error in order to document it in the Arduino library specification. I believe the idea was to provide backwards compatibility with Arduino IDE 1.6.4 and older, before the switch to using RSyntaxTextArea. But I think the solution was to simply continue using the old tokens. Maybe there are some good reasons why that wouldn't be possible but I can't image what they would be. Now we have this situation where less than 1% of libraries use the new tokens and the old tokens don't provide the same highlighting as they did in old IDE versions. People already have enough trouble getting two fields right so four fields, one of which is empty, would definitely be a disaster.

Anyway, if you'd rather I just approve this and forget about it then I'm happy to do that

I'm glad people are thinking about it. I personally don't have the ability to fix this problem in the IDE's Java code so my solution to the problem has been to submit a bunch of PRs to libraries with incorrect keywords.txt formatting with links to the specification in the hopes that this will bring some collective awareness to the Arduino community but I also understand that a more intelligent alternative would be to simply change a few lines of code in the Arduino IDE.

The elephant in the room is that we shouldn't need to have a keywords.txt file at all. The IDE should automatically detect keywords. It's terrible how every keyword for every installed library is highlighted, regardless of whether your code is using that library and how that word is used. It gets ridiculous once you have a a lot of libraries installed and every other word is highlighted.