martinblech / xmltodict

Python module that makes working with XML feel like you are working with JSON
MIT License
5.49k stars 462 forks source link

Xmltodict 0.13.0 likely broken (or moto use of it) #297

Closed potiuk closed 2 years ago

potiuk commented 2 years ago

The XMLtodict 0.13.0 is likely broken (or moto use of it)

We have a sophisticated test harness in Airlfow where we run our tests for latest releases of our dependencies and it caught a strange eror.

I investigated it and traced it to xmltodict 0.13.0 upgrade.

I am not 100% sure whether it's the xmltodict or the way how boto/moto uses it but wanted to give an early warning in case you can investigate it faster.

The problem is that we are starting to ge "ValueError Malformatted input" when 0.13.0 is installed in some of our Amazon EMR tests:

tests/providers/amazon/aws/hooks/test_emr.py::TestEmrHook::test_create_job_flow_extra_args: ValueError: Malformatted input
tests/providers/amazon/aws/hooks/test_emr.py::TestEmrHook::test_create_job_flow_uses_the_emr_config_to_create_a_cluster: ValueError: Malformatted input
tests/providers/amazon/aws/hooks/test_emr.py::TestEmrHook::test_get_cluster_id_by_name: ValueError: Malformatted input

The error is thrown in moto in this part of the code:

         od = OrderedDict()
          for k, v in value.items():
              if k.startswith("@"):
                  continue

              if k not in spec:
                  # this can happen when with an older version of
                  # botocore for which the node in XML template is not
                  # defined in service spec.
                  log.warning("Field %s is not defined by the botocore version in use", k)
                  continue

              if spec[k]["type"] == "list":
                  if v is None:
                      od[k] = []
                  elif len(spec[k]["member"]) == 1:
                      if isinstance(v["member"], list):
                          od[k] = transform(v["member"], spec[k]["member"])
                      else:
                          od[k] = [transform(v["member"], spec[k]["member"])]
                  elif isinstance(v["member"], list):
                      od[k] = [transform(o, spec[k]["member"]) for o in v["member"]]
                  elif isinstance(v["member"], OrderedDict):
                      od[k] = [transform(v["member"], spec[k]["member"])]
                  else:
  >                   raise ValueError("Malformatted input")
  E                   ValueError: Malformatted input

It's rather easy to reproduce:

1) checkout airflow from https://github.com/apache/airflow 2) install breeze (pipx install -e ./dev/breeze) - airflow development environment 3) run breeze - this will pull and build the docker image that has all necessary deps 4) run pip freeze | grep xmltodict - see that 0.12.0 version is installed 5) once in the container run pytest tests/providers/amazon/aws/hooks/test_emr.py -> they shoudl succed 6) upgrade xmltodoc : pip install xmltodict==0.13.0 7) run pytest tests/providers/amazon/aws/hooks/test_emr.py -> they shoudl fail with the "Malformed input"

Once again 0 I am not sure if the root cause is moto/boto use of xmltodic or xmltodict itself, but I decided to open issues in both - maybe you can investigate and fix the root cause quicker.

Related: https://github.com/apache/airflow/issues/23576 Related: https://github.com/spulec/moto/issues/5112

martinblech commented 2 years ago

Please send a pull request with a new test case that exercises this failure in xmltodict.

potiuk commented 2 years ago

I am unable to - without a lot of investigation.

With this issue I just wanted to let you that there is a potential bug in there.I just pinned xmltodoc to 0.12.0 in Airflow and maybe we are going to investigate further (but we have 70 providers and ~600 dependencies in Airflow - this is just one of transitive dependencies of our dependencies., so I am not sure if we can investigate all such issues down to a test case.

Fine if you close it - it makes no difference to us as we simply pin our dependenceis there. Hopefully it will not bite you back. or maybe someone will investigate it and you will get it fixed..

For me - I just warned you, you can close the bug, no problem (or rather - it's likely still your problem, but not ours for now)..

bblommers commented 2 years ago

Moto maintainer here - the failure was due to the xmltodict's change to swap OrderedDict for dict in Python >= 3.7. So it's not so much a bug as an unexpected change in behaviour.

In case anyone else runs into this, we fixed this on our end in https://github.com/spulec/moto/pull/5111. Moto >= 3.1.9.dev3 should be fully compatible with xmltodict 0.13.0.

ggainey commented 2 years ago

I just tripped over this as well - in case it's helpful to anyone else, here's how I'm addressing it : pulp/pulp_rpm#2520