Closed elkym closed 3 years ago
Good catch! When I originally developed that format method I hadn't dealt with namespaced XML so I was going off of the XML specification for valid tag characters. I can add support for colons in XML tags (I'm currently messing with XML so this is good timing!)
A very quick reply! Thanks!
I guess in the meantime, I'll have to do a workaround.
I'm planning on publishing v61 tonight, so this should be included as part of that build.
You know, a friend of mine was looking at the code with me (he's a Javascript professional, and I don't read it very well yet), and he suggested some refactors that would condense the script a bit. Would you be interested in seeing what he suggested? (He said I could forward it to you).
I mean, I don't want to pile a whole bunch on your plate or tell you how to write code-- but if what he drafted is useful to you, I'd hate for you to not get a chance to see it (or even just collaborate directly).
I'm always up for cleaner code! Feel free to send me suggestions or post it here and I'll take a look!
An attempt at posting the code here is formatting wonky. I'll ask him to point a link or something.
Oh, I was just adjusting whitespace and switching from var to const/let in one sample section. Every professor I've had preferred Allman braces, and every team I've worked on preferred Egyptian or One-True-Brace braces. I normally just let Prettier be the authority on modern code style.
Haha, I tend to lean Allman style as it makes pairing braces easier and adds spacing for readability. And yeah, code can be posted as is by wrapping it with three grave accents (the other char on the ~ key, which I guess mobile keyboards don't have so I can't post a true example currently). It should look something like:
''' YOUR CODE HERE '''
(But replace the apostrophes with accents)
As for the let/const declarations, apps script recently updated its runtime engine to V8 which finally allowed newer JavaScript keywords. Before V8 everything had to be a var which was super annoying. I've moved a few vars to be consts but I still need to take a pass at fully updating code.
After messing with this for a bit I think it's going to be far more complicated than I originally expected. Since I'm using XmlService to create XML I can't just ignore :
as it is reserved for namespace purposes so attempting to use something like xmlns:xsi
as an element or attribute's name will throw an error. Instead, I'll need to parse the cell value into separate values of the namespace string and the actual element name. In the case of the xmlns:xsi
example I'll need to separate that into xmlns
(namespace) and xsi
(name).
This may end up spiraling if I don't track the root element namespace which could cause namespace issues on child elements. I'll need to experiment with how XmlService handles creating and using namespaces since my initial tests don't seem to be yielding desired results. For the time being, I'm going to finish the XML work I had planned for v61 and push this to v62 so I can more thoroughly test it. This means you'll have to do a workaround for namespaces for now. Sorry for the inconvenience!
Thanks for the update. I wondered if it would be more complex-- I wasn't sure what all of the pieces you were relying on were. Good luck. Maybe I'll be decent enough at coding to lend you a hand sometime. (And maybe Grad School will crush me into small pieces first).
Question:
On line 344:
xmlName = xmlName.replace(/[^a-zA-Z0-9\-\_]/gm, replacement); //Replace non-alphanumeric, dash, underscore, or period chars with an underscore
Does this limitation accurately represent the only characters that can appear in an XML tag? Alternatively, does it accurately represent the only characters that can be processed via XMLService?
I largely pulled the valid chars from the specifications on w3schools. There's a section on that page talking about what chars are valid for XML element names. I should double check that the regex there also grabs spaces and dots. Note that these limitations do not apply to the value of a tag, just the tag name itself. A tag (or attribute) with the name Na|me
would be invalid, but the tag's value can be Na|me
without issue.
I was originally creating XML by manual string concatenation before I moved over to using XmlService. Even though adding namespace support would have been easier with manual XML generation, XmlService ensures that ESD produces valid XML every time so I'm going to have to figure out how best to handle namespaces through that. I think I had namespace generation working last night (in the sense that I was creating Namespace values), but XmlService didn't seem to like how I was doing it and threw errors on attributes created with a namespace.
Just pushed v61 and I'm thinking about how to handle this best. I think a good approach would be allowing users to set up a list of namespaces in the sidebar then allow key cells to use namespaces via prefixes separated by colons. This would probably look something like a reorderable list with input fields for the namespace prefix and URI:
Label | Field |
---|---|
Prefix | xhtml |
URI | http://www.w3.org/1999/xhtml |
That would add xmlns:xhtml="http://www.w3.org/1999/xhtml"
to the root element's tag. Once the user has declared the xhtml
namespace, they could use that prefix in key cells, for example: xhtml:test
Does that seem like a good approach or is there something else you had in mind?
It's a little clunky right now, but I do have the UI for namespaces roughly working. I've added a new XML section to contain namespace settings since it would have seriously bloated the Advanced XML section. I still need to get the namespace parsing on the sheets side set up, but it's a start!
XML namespace UI is looking much better now. Just have some minor UX to deal with there and I can move onto getting namespaces working in exported XML.
Things are looking good on namespace support. I'm hoping to get it finished up and ready to go in v62 this weekend. Had to jump through a few hoops to get multiple namespaces declared in a single element since XmlService
apparently doesn't support that (even though people have been asking for it since 2015 according to one support thread). I'm debating adding an option to set namespace declaration behavior, either in the root element like below or per element. Not sure if there's a preference there. In any case, I have this exporting from a test sheet:
<data xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
<x:test1>
<XLSX>1</XLSX>
<ss:Type>Test</ss:Type>
</x:test1>
<test2>
<XLSX>0</XLSX>
<ss:Type>1.02</ss:Type>
</test2>
<test3>
<XLSX>true</XLSX>
</test3>
<test4>
<XLSX>false</XLSX>
</test4>
</data>
This is now implemented and will roll out with the v62 release.
On line 327 of the gs: function formatXmlName(value, replacement) is the only function that I can find which would seem to be responsible for all colons in tags with underscores (or hyphens, or periods).
There are a number of xml metadata formats that this is troublesome for.
Is this a feature that should be developed? Or am I creating this problem with a user error?
I'm not terribly familiar with Javascript and I'm new to coding, but I think I've puzzled this out reasonably well.
If this is a settings problem that I can solve that way, please do explain.
Thanks.