Open 35C4n0r opened 1 year ago
@35C4n0r Can you elaborate on what issue you are facing with details? Which exact files are a problem to parse?
@pombredanne I was just creating this on the Project Board and accidentally converted it to an Issue. Anyways, I've added a proper description.
@35C4n0r thanks! I was just curious! I sometimes do this too. Now I wonder if we ever care about the XML tag case? Because if so you do not even need to build a mapping at all.
To parse XML documents we will use BeautifulSoup4 with html.parser. We use this instead of Python's built-in XML parser because the XML to be parsed is sometimes malformed: html.parser is lenient, whereas the standard library's XML parser only handles well-formed XML.
There is an issue with this approach:
In order to deal with this:
Example:
The Mapping:
The new XML
After parsing it with BeautifulSoup
After using the map to convert the tags back
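The snippets for the steps above are not shown here, so the following is only a rough sketch of what a tag mapping round trip could look like, assuming the issue is the tag case mentioned in the comments. It uses the standard library's HTMLParser directly (the class behind BeautifulSoup's html.parser backend), which reports tag names in lower case. The sample XML and the names TagCollector, build_tag_map and restore_tags are made up for illustration and are not taken from the project.

```python
import re
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper: record start tag names as html.parser reports them."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # html.parser passes the tag name already converted to lower case.
        self.tags.append(tag)


def build_tag_map(xml_text):
    """Map each lowercased tag name to its original spelling in the raw text.

    Assumption: no two tags differ only by case, otherwise the later
    spelling overwrites the earlier one.
    """
    names = re.findall(r"<\s*([A-Za-z_][\w:.\-]*)", xml_text)
    return {name.lower(): name for name in names}


def restore_tags(parsed_tags, tag_map):
    """Convert lowercased tag names back to their original spelling."""
    return [tag_map.get(tag, tag) for tag in parsed_tags]


# Made-up sample document with mixed-case tag names.
sample = (
    "<Project><groupId>org.example</groupId>"
    "<artifactId>demo</artifactId></Project>"
)

collector = TagCollector()
collector.feed(sample)
collector.close()

tag_map = build_tag_map(sample)
restored = restore_tags(collector.tags, tag_map)

print(collector.tags)  # tag names as seen by the parser
print(restored)        # tag names after applying the map
```

If the tag case never matters to the caller, the map and the restore step can be dropped and the lowercased names used as they are.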