Leonidas-from-XIV / node-xml2js

XML to JavaScript object converter.
MIT License
4.84k stars 598 forks source link

parseString callback doesn't always return an object #627

Closed andrewhodel closed 11 months ago

andrewhodel commented 2 years ago

Shouldn't the 2nd argument to cb() always return an object?

https://github.com/Leonidas-from-XIV/node-xml2js/blob/master/lib/parser.js#L303

It's a problem if you check for errors by validating a one layer deep field in the 2nd argument to cb() and there is no reason the callback cannot set this to {}.

andrewhodel commented 2 years ago
xml.parseString(''), function(err, result) {

  // no result object
  // should be result = {}

});
andrewhodel commented 2 years ago

@Leonidas-from-XIV

Leonidas-from-XIV commented 11 months ago

I think setting it to {} (a potentially valid value) makes it too easy to ignore that an error happened.

andrewhodel commented 11 months ago

@Leonidas-from-XIV that's not right.

Leonidas-from-XIV commented 11 months ago

How isn't it. If you expect an object, you'll get an object and think everything is ok, when in fact the parsing failed.

andrewhodel commented 11 months ago

Because there are layers and known expectations.

For example, TLS ensures by HTTP content length that all of the data was sent.

What you are saying could be caused by sending an object with missing fields, and the reality there is that you should be validating each field that's required but you aren't yet.

Leonidas-from-XIV commented 11 months ago

I'm a big fan of defense in depth, where every place tries to help the programmer not to make mistakes (where possible, in this case something like a result type would've been better but the damage is already done). So if you don't catch the error, don't pretend everything went fine by returning a potentially valid value.

If the fact that the value is undefined is creating problems, you can for sure create a wrapper around parseString that replaces undefined with {} but I don't thing that adding more footguns to this library is a good idea.

andrewhodel commented 11 months ago

@Leonidas-from-XIV and the argument there is that Python failed because of try catch blocks and exceptions.

Is that bank paying you to say that or are you being sincere, I think they are trying to stop work and create suburbs.

You are wrong about this, if the expected return value is of the type object then it should always be an object and the testing should be done inside the object.

That is why JavaScript uses the term Object.

andrewhodel commented 11 months ago

The fact is that the data structures are data structures and the remaining work is in each program. This is proven by modules and the count of function arguments eventually removing the validity of modules when complexity is enough to grant value.

What you are doing creates struggle and causes failure to all those who haven't been programming for decades and would already know to create a new language if these failures were normal.

I'm trying to help you with your parser.

It's a format conversion tool and the input type and output type should be the same (if the constructs allow as true in this situation) to allow debugging to be done at the correct level.

Leonidas-from-XIV commented 11 months ago

The correct level is: if the parsing failed, don't produce invalid data. It is as easy as that.

andrewhodel commented 11 months ago

That's not true, the function returns two values.

The first value is an error value and it indicates if there was a problem parsing the data.

You are asking for people using your module to add extra code other than the error handling.

The expected use is:

xml.parseString('aaaa'), function(err, result) {

  if (err) {
    console.log(err);
    return;
  }

  // parse result
  console.log(result.field);

});

But you allow this to crash, while error handling is written:

xml.parseString(''), function(err, result) {

  if (err) {
    console.log(err);
    return;
  }

  // parse result
  console.log(result.field);

});

You are expecting people to write error handling twice, in the same place. That is not reasonable:

xml.parseString(''), function(err, result) {

  if (err) {
    console.log(err);
    return;
  }

  if (typeof(result) !== 'object') {
    console.log('result not object');
    return;
  }

  // parse result
  console.log(result.field);

});
Leonidas-from-XIV commented 11 months ago

In your code, you check that an error has occured and if not then you return. So if there is no error past that point, you can assume that the result is a valid object, no need to check that is is an object.

Unless you're reporting a case where there is no parsing error reported yet the result is not an object. In that case there should obviously be an error reported, and this is indeed an issue, but the fix is to report an error and not return {} as a result.

andrewhodel commented 11 months ago

@Leonidas-from-XIV that's not true, an empty string is parsed to an empty object.

Leonidas-from-XIV commented 11 months ago

That's probably not right, as an empty string is not valid XML so shouldn't parse to an empty object.

andrewhodel commented 11 months ago

You're not thinking, you are thinking one or the other not actually thinking.

It's not invalid XML, there's no error.

Think about it, they need to increment a counter because parsing was attempted now they must add logic to error handling and the non error handling.