drmohundro / SWXMLHash

Simple XML parsing in Swift
MIT License
1.41k stars 205 forks source link

detect end tag #129

Closed Rufy86 closed 7 years ago

Rufy86 commented 7 years ago

Hi everyone,

I'm looping on JMdict xml file that Is made in this way:

<JMdict>
    <entry>
        <ent_seq>1000000</ent_seq>
        <r_ele>
            <reb>ヽ</reb>
        </r_ele>
        <r_ele>
            <reb>くりかえし</reb>
        </r_ele>
        <sense>
            <pos>&n;</pos>
            <gloss>repetition mark in katakana</gloss>
            <gloss xml:lang="ita">simbolo di ripetizione in katakana</gloss>
        </sense>
    </entry>
    <entry>
        <ent_seq>1000010</ent_seq>
        <r_ele>
            <reb>ヾ</reb>
        </r_ele>
        <r_ele>
            <reb>くりかえし</reb>
        </r_ele>
        <sense>
            <pos>&n;</pos>
            <gloss>voiced repetition mark in katakana</gloss>
        </sense>
    </entry>
........
</JMdict>

I'm using this code

func enumerate(indexer: XMLIndexer) {
  for child in indexer.children {
    print(child.element!.name)
    enumerate(child)
  }
}

enumerate(indexer: xml)

but, because I would like transform this xml in a array of dictionary in which every dictionary is the content of tag , how can I check and detect the end of a tag, in this case, </entry>?

drmohundro commented 7 years ago

I'm not sure I'm understanding - are you wanting a Dictionary[String: String] where entry would be a key to a string that contains the XML literal? Or are you wanting a dictionary that contains sub-dictionaries? I'm just trying to understand the structure you're wanting to populate.

Rufy86 commented 7 years ago

I would like this let JMdict = Array<Dictionary<String, AnyObject>> in which the dictionary contains what there is between the tags <entry> and </entry>. the value of the dictionary is AnyObject because some elements are multiple(eg. language). then, I would like insert these in array or dictionary. to do this, I need loop on child and when the entry is over, close all dictionary and array, insert them in the entry dictionary and add this in JMdict array.

I hope to have explained better

drmohundro commented 7 years ago

So... an array of a dictionary keyed by string. So, given what you've got above, what would the first slot in the array be? I'm wondering if what you're wanting is instead just Dictionary<String, AnyObject> instead of an array of dictionary.

So, something like this structure?

{
    "1000000" : { r_ele: '', sense: { pos: '', gloss: '' } },

    "1000010" : { r_ele: '', sense: { pos: '', gloss: 'voiced repetition mark in katakana' } }

That's more of a JSON-like view, but maybe it communicates what you're looking for? I'm assuming you're wanting ent_seq to be the key to your dictionary? Is that correct?

Rufy86 commented 7 years ago

not exatly:

[["id_database":"1000000", "r_ele" : ["ヽ", "くりかえし"] , "sense" : ["en":"repetition mark in katakana","ita":"simbolo di ripetizione in katakana"]], ["id_database":"1000010", "r_ele" : ["ヽ", "くりかえし"] , "sense" : ["en":"voiced repetition mark in katakana"]]]

an array of dictionary. every dictionary contains the datas of one entry

Rufy86 commented 7 years ago

obviously, JMdict xml file is more complicate. but the principle is that: transform xml file in a array of dictionary, in which every dictionary contains the info of one entry

drmohundro commented 7 years ago

I'll admit that it sounds like the structure you're wanting to populate sounds an awful lot like the internal structure that SWXMLHash uses. I'm not sure if you could use the XMLIndexer instead, but I did want to mention that the underlying structure is similar. It uses a stack of XMLElement and each XMLElement contains its own stack of XMLElements.

Okay, well I'll try to throw out some general pseudo code to help think through how you might could handle this...

func enumerateEntry(subDict: Dictionary<String, String>, entryXML: XMLIndexer) {
    for child in entryXML.children {
        subDict[child.element!.name] = child.element!.text
    }
}

var dict: Dictionary<String, Dictionary<String, String>>
for database in xml["entry"] {
    subDict = Dictionary<String, String>()
    enumerateEntry(subDict, database)
    dict[database["ent_seq"].element!.text] = subDict
}
drmohundro commented 7 years ago

Hey, just checking in - were you ever able to resolve this issue?