carj / pyPreservica

Python language binding for the Preservica API
https://pypreservica.readthedocs.io
Apache License 2.0

xml.etree.ElementTree.findall('.//{*}ElementName') (and .find('.//{*}ElementName')) does not match namespaces? #2

Closed gitrauno closed 3 years ago

gitrauno commented 3 years ago

Hi James!

I've had an issue since upgrading to v6.2 and updating pyPreservica via pip to match.

Using EntityAPI functions children() or descendants() throws an error for me: builtins.AttributeError: 'NoneType' object has no attribute 'text'

Example code:

#!/usr/bin/python3
# -*- coding: utf-8 -*-
# I need to set environment variable for Requests to connect with self-signed certificates
import os
os.environ['REQUESTS_CA_BUNDLE'] = os.path.join(os.getcwd(), 'preservica.crt')
# Example:
import pyPreservica
print(pyPreservica.__version__) # prints "0.8.6"
client = pyPreservica.EntityAPI(username="rauno", password="redacted", tenant="EE", server="preservica62.rauno")
root_folders = client.children()
print(root_folders)

Output (error):

File "C:\Users\rauno\Desktop\2020-11-13 pyPreservica\02_pyPreservica.py", line 9, in <module>
  root_folders = client.children()
File "c:\Users\rauno\AppData\Local\Programs\Python\Python36-32\Lib\site-packages\pyPreservica\entityAPI.py", line 1027, in children
  return PagedSet(result, has_more, total_hits.text, url)

builtins.AttributeError: 'NoneType' object has no attribute 'text'

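So I tried tracking the issue down, and I think it's related to the xml.etree.ElementTree.findall() usage in entityAPI.py line 1010 (https://github.com/carj/pyPreservica/blob/c94d1e8645e2d5f49e461d8898632ea0353b3595/pyPreservica/entityAPI.py#L1010). I don't think "{*}" as a namespace is working. My testing: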

#!/usr/bin/python3
# -*- coding: utf-8 -*-
# For using a self-signed certificate
import os
os.environ['REQUESTS_CA_BUNDLE'] = os.path.join(os.getcwd(), 'preservica.crt')
# Fetching XML
import requests
import lxml.etree # For pretty-printing

myobj = {"username" : "rauno",
         "password" : "redacted",
         "tenant" : "ee"}
base_url = "preservica62.rauno"
login_request = requests.post(f'https://{base_url}/api/accesstoken/login', data = myobj)
pat = login_request.json()["token"]
header_pat = {"Preservica-Access-Token" : pat}
root_children_request = requests.get(f'https://{base_url}/api/entity/root/children',
                                     params={"start":0,"max":50}, headers=header_pat)
# Response-XML as Bytes
xml_content_response = root_children_request.content
print("XML Response:")
print(lxml.etree.tostring(lxml.etree.fromstring(xml_content_response), pretty_print=True).decode("UTF-8"))

# Showing problem
import xml.etree.ElementTree
xml_fromstring = xml.etree.ElementTree.fromstring(xml_content_response.decode("utf-8"))
children = xml_fromstring.findall(".//{*}Child")
print("With {*}:",children)
children_with_namespace = xml_fromstring.findall(".//{http://preservica.com/EntityAPI/v6.2}Child")
print("With namespace:", children_with_namespace)
children_with_prefix_namespace = xml_fromstring.findall(".//ent:Child", namespaces={"ent":"http://preservica.com/EntityAPI/v6.2"})
print("With prefix:", children_with_prefix_namespace)

Output (no errors, but I cropped out SubjectAltNameWarnings):

[...]
XML Response:
<ChildrenResponse xmlns="http://preservica.com/EntityAPI/v6.2" xmlns:xip="http://preservica.com/XIP/v6.2">
  <Children>
    <Child title="Testijuurikas" ref="bb5f4634-7610-49d7-8f1a-235b69f3eae2" type="SO">https://preservica62.rauno/api/entity/structural-objects/bb5f4634-7610-49d7-8f1a-235b69f3eae2</Child>
    <Child title="Vastuv&#245;tujuurikas" ref="61d64c16-6b74-4bb7-bd82-8b786782b868" type="SO">https://preservica62.rauno/api/entity/structural-objects/61d64c16-6b74-4bb7-bd82-8b786782b868</Child>
  </Children>
  <Paging>
    <TotalResults>2</TotalResults>
  </Paging>
  <AdditionalInformation>
    <Self>https://preservica62.rauno/api/entity/root/children</Self>
  </AdditionalInformation>
</ChildrenResponse>

With {*}: []
With namespace: [<Element '{http://preservica.com/EntityAPI/v6.2}Child' at 0x03FAF5D0>, <Element '{http://preservica.com/EntityAPI/v6.2}Child' at 0x03FAF600>]
With prefix: [<Element '{http://preservica.com/EntityAPI/v6.2}Child' at 0x03FAF5D0>, <Element '{http://preservica.com/EntityAPI/v6.2}Child' at 0x03FAF600>]

For me, {*} does not match anything, so findall() returns an empty list and find() returns None, which is where the error comes from. The {*} wildcard is used in other find() and findall() calls too.
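For anyone who has to stay on an interpreter where {*} does not match, one possible workaround is to compare local names directly instead of relying on the path syntax. This is only a minimal sketch, not what pyPreservica does: `local_name` and `find_all_local` are hypothetical helpers, and the XML is a trimmed copy of the response above with placeholder attribute values and URLs.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the ChildrenResponse from this issue; the attribute values
# and URLs are placeholders, only the namespace and structure matter here.
XML = """<ChildrenResponse xmlns="http://preservica.com/EntityAPI/v6.2">
  <Children>
    <Child title="A" ref="ref-a" type="SO">https://example.invalid/a</Child>
    <Child title="B" ref="ref-b" type="SO">https://example.invalid/b</Child>
  </Children>
  <Paging>
    <TotalResults>2</TotalResults>
  </Paging>
</ChildrenResponse>"""


def local_name(tag):
    """Return the tag name without its leading '{namespace}' part."""
    return tag.rsplit("}", 1)[-1]


def find_all_local(root, name):
    """Hypothetical helper: collect elements whose local name equals `name`,
    whatever namespace they are in. Does not use the {*} path syntax."""
    return [el for el in root.iter()
            if isinstance(el.tag, str) and local_name(el.tag) == name]


root = ET.fromstring(XML)
children = find_all_local(root, "Child")
total = find_all_local(root, "TotalResults")

# Guard against "no match" instead of calling .text on a missing element.
print(len(children), total[0].text if total else None)
```

The explicit `if total else None` guard is the part that avoids the AttributeError from the traceback: a missing element becomes a visible None rather than a crash inside `.text`.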

Or if it's fine on your end, could this be an environmental issue on my side?

All the best Rauno

carj commented 3 years ago

Hi

Which version of Python are you using? I only have a 3.8.x system to test against.

~ James

gitrauno commented 3 years ago

Hi!
You're absolutely correct! I'm using version 3.6, and I also have 3.7 set up, where I've worked with xml.etree.ElementTree and variants, but:

Changed in version 3.8: Support for star-wildcards was added.

1) Python 3.8 docs on xml.etree.ElementTree
2) What's new in Python 3.8
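Since the wildcard only exists from 3.8 onwards, code that has to run on both sides of that boundary could pick the path at runtime. The following is a rough sketch under that assumption: `child_path` and `HAS_STAR_WILDCARD` are made-up names, and the namespace URI is the one from the response in this issue, so other server versions would presumably need a different value.

```python
import sys
import xml.etree.ElementTree as ET

# Star-wildcard namespaces ("{*}tag") were added to ElementTree in Python 3.8.
HAS_STAR_WILDCARD = sys.version_info >= (3, 8)

# Namespace URI as seen in this issue's response (an assumption for any
# other server version).
ENTITY_NS = "http://preservica.com/EntityAPI/v6.2"


def child_path(name, ns=ENTITY_NS):
    """Hypothetical helper: build a descendant path usable on this interpreter."""
    if HAS_STAR_WILDCARD:
        return ".//{*}" + name
    # Older interpreters need the namespace spelled out in full.
    return ".//{" + ns + "}" + name


root = ET.fromstring(
    '<r xmlns="%s"><Paging><TotalResults>2</TotalResults></Paging></r>' % ENTITY_NS
)
hit = root.find(child_path("TotalResults"))
print(hit.text if hit is not None else "no match")
```

The trade-off is that the fallback branch is tied to one namespace version, which is exactly what the {*} wildcard avoids on 3.8 and later.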

When searching for solutions I searched for Python .//{*}, but if I had searched just Python {*} I would've seen my mistake.
Time to upgrade for me :)

Sorry to make this issue and thanks for a quick response!
Rauno

carj commented 3 years ago

Hi,

Glad you found the issue. I would like to get some older Python distributions installed at some point to check functionality. I have a suite of system tests which I run on my 3.8.5 system before I create new releases.

For the time being I can only really support 3.8.x

~ James