ndokter / dsmr_parser

Library to parse Dutch Smart Meter Requirements (DSMR) telegrams.
MIT License
110 stars 64 forks source link

Fix parsing tests and line start matching #132

Closed dupondje closed 1 year ago

dupondje commented 1 year ago

This pull req fixes 2 things:

First of all we seem to handle the exception in the code for parsing errors and just log a message. So tests never see parsing errors. Which is not what we want :) So we raise the exception triggered by the parsing.

Next to that, we enable multiline flag on the regex, so it creates a new line after every \n. This allows us to use line start (^) in the regexes, so we don't match things in a middle somewhere. (Fixes #128)

lowdef commented 1 year ago

There was a reason we catch the parse exception and only log it. It makes sure that people that have equipment that have single offending lines in their telegrams can still use the parser. It solved issues some people had. So I do not think it a good idea to change this.

If this is solely for testing then we need to find another test strategy.

(Basically this is undoing pull request #74 )

ndokter commented 1 year ago

Maybe the TelegramParser should raise an error and the clients like serialreader should determine what to do with it? So move the error logging to all clients?

dupondje commented 1 year ago

The question here is: is it a good idea to just ignore parsing errors and don't pass anything to the application using the dsmr_parser lib? Cause dsmr_parser is really a lib, so I think the application should always have full control over everything, so also handle exceptions from data it passed to it. Cause now the client/app that uses dsmr, thinks everything is just all fine while there are exceptions...

PR #74 might be the wrong solution? The application that calls the .parse() function should handle the exception imo. Just like stated in the code: https://github.com/ndokter/dsmr_parser/blob/master/dsmr_parser/parsers.py#L46 , the function can return exceptions, so handle them correctly and don't expect the lib to handle them.

But that's just my opinion :)

dupondje commented 1 year ago

Just rebased :)

lowdef commented 1 year ago

The question is: if a single line does not parse, does that render all other lines invalid? The answer is no, as we saw in the practical examples. As the exception breaks the parsing loop, once the exception is propagated upwards there is no chance to know what the other lines parse into.

So I still think that #74 is an elegant solution to make the parser more robust for flawed equipment out there. Maybe we can control the behaviour with a flag, like we do for CRC checksum_support? Then for tests we can set the flag telegram_specification['propagate_parse_exceptions '] to true, and in the wild we can set telegram_specification['propagate_parse_exceptions '] to false (default).

It satisifies both use cases and we do not need extensive changes on the client side.

dupondje commented 1 year ago

I changed the code a bit. Now it will only raise the exception when parse is called with the throw_ex option.

This is what we then use in the tests to make sure everything is parsed correctly.

While I still think the best way would be to let the application that uses dsmr_parser to handle the exception, you might want (like @lowdef mentioned) the parsing to continue to just ignore the failed line.

The cleanest way would be to have parse return the value + the possible exception/None, but then all clients/apps using dsmr_parser would need to get updated, which we won't do :) So I think this is for now the cleanest solution ...

ndokter commented 1 year ago

If its just for unit testing, why not mock the logger and verify that way?

@mock.patch('dsmr_parser.parsers.logger')
def test(self, logger_mock):
    logger_mock.assert_called_once_with('the exact error string')
    # or alternatively something with logger_mock.call_args_list
dupondje commented 1 year ago

If its just for unit testing, why not mock the logger and verify that way?

@mock.patch('dsmr_parser.parsers.logger')
def test(self, logger_mock):
    logger_mock.assert_called_once_with('the exact error string')
    # or alternatively something with logger_mock.call_args_list

Would be an option also. But maybe somebody wants to catch exceptions in its application also in the future? I think both are valid options :)

ndokter commented 1 year ago

If its just for unit testing, why not mock the logger and verify that way?

@mock.patch('dsmr_parser.parsers.logger')
def test(self, logger_mock):
    logger_mock.assert_called_once_with('the exact error string')
    # or alternatively something with logger_mock.call_args_list

Would be an option also. But maybe somebody wants to catch exceptions in its application also in the future? I think both are valid options :)

For me the current solution (throw_ex argument) is fine. What about you @lowdef?

lowdef commented 1 year ago

Yes fine with me too!

lowdef commented 1 year ago

One question @dupondje; @ndokter : While we are at that specific fragment.

Would it not be better to catch only specificly the ParseError exception, for all other exceptions need to be propagated always as we can not know how to recover from low level system exceptions for example.

            for match in matches:
                try:
                    dsmr_object = parser.parse(match)
                except ParseError:
                    logger.error("ignore line with signature {}, because parsing failed.".format(signature),
                                 exc_info=True)
                    if throw_ex:
                        raise
                except Exception as err:
                    logger.error(f"Unexpected {err=}, {type(err)=}")
                    raise
                else:
                    telegram.add(obis_reference=signature, dsmr_object=dsmr_object)
dupondje commented 1 year ago

@lowdef : Agree!

I adjusted the commit. Just made some change to the output as we nowhere use the f"" formatter but always .format() in other places in the code :) Just to keep the same everywhere.

lowdef commented 1 year ago

yes, perfect. I assumed you would clean up, it was just for illustration what i ment. I copied a fragment from the python docs.

ndokter commented 1 year ago

Thanks everyone for your work, i merged it and will make a new release soon

lowdef commented 1 year ago

Interesting, didn't know that. Thanks for sharing.

Nigel Dokter @.***> schrieb am Fr., 14. Apr. 2023, 16:49:

@.**** commented on this pull request.

In dsmr_parser/parsers.py https://github.com/ndokter/dsmr_parser/pull/132#discussion_r1166933801:

                 logger.error("ignore line with signature {}, because parsing failed.".format(signature),

exc_info=True)

  • if throw_ex:
  • raise
  • except Exception as err:
  • logger.error("Unexpected {}: {}".format(type(err), err))

No need to change, but in general when using the logging module i pass the variables as arguments:

logger.error("Unexpected %s: %s", err, err)

Here is a nice blogpost about it https://www.palkeo.com/en/blog/python-logging.html#use-a-format-string-with-arguments

In short: when doing things with the logs (for example Sentry), it allows for better grouping etc

— Reply to this email directly, view it on GitHub https://github.com/ndokter/dsmr_parser/pull/132#pullrequestreview-1385613707, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACCJDWAIQNQ5D6NTRAL7JRDXBFPZNANCNFSM6AAAAAAW3J7P2M . You are receiving this because you were mentioned.Message ID: @.***>

dupondje commented 1 year ago

@ndokter : Any chance to push a release soon? :) I noticed the previous release is not published on pip neither: https://pypi.org/project/dsmr-parser/

ndokter commented 1 year ago

@ndokter : Any chance to push a release soon? :) I noticed the previous release is not published on pip neither: https://pypi.org/project/dsmr-parser/

Thank you for the reminder. I pushed it under v1.2.3! https://pypi.org/project/dsmr-parser/