Closed simonbowly closed 2 years ago
@mattmilten we are pretty close to done with refactoring in this PR :) There are some very minor behaviour changes, which we've documented in the changelog. To avoid regressions, the previous code is still in the repo, and `tests/test_regression.py` tests the refactored code against the v1.3.2 code (we should delete both of these before finalising).
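The shape of such a regression test might look like this (an illustrative sketch only: `old_parse`, `new_parse`, and the log path are placeholder stand-ins, not the actual contents of `tests/test_regression.py`):

```python
# Illustrative regression-test shape: run both implementations on the same
# log file and require identical summaries. old_parse/new_parse are
# placeholder stand-ins for the v1.3.2 code and the refactored code.

def old_parse(logfile: str) -> dict:
    # placeholder for the v1.3.2 implementation's summary output
    return {"Status": 2, "Runtime": 1.23}

def new_parse(logfile: str) -> dict:
    # placeholder for the refactored implementation's summary output
    return {"Status": 2, "Runtime": 1.23}

def test_summary_matches():
    # The real test would compare summaries across a suite of sample logs.
    assert new_parse("data/example.log") == old_parse("data/example.log")
```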
When you have a chance to review, could you let us know your thoughts? A few things could probably still be tidied up, but we're quite happy with the structure.
Thanks @maliheha for your great work on this so far, and for putting together the summary below of the new structure:
- `src/grblogtools/parsers/` contains the individual parsers. Each parser has a main method `parse()` that takes a single line as input and returns `True`/`False` depending on whether the line is matched by a pattern associated with the parser. Each parser also provides `get_summary()` and, where applicable, `get_progress()` to return the parsed summary or the detailed progress.
- `SingleLogParser`
(see `src/grblogtools/parsers/single_log.py`) is responsible for parsing one single log run. Its main method `parse()` takes a line as input and returns `True`/`False` depending on whether the line is matched by some pattern. `SingleLogParser` has two internal variables, `current_parser` and `future_parsers`, which it keeps updated as it sees more lines. It initializes `current_parser` to the `header_parser` and `future_parsers` to the list of remaining parsers. If a line is not matched by `current_parser`, `SingleLogParser` checks whether it should pass the line to any of the remaining parsers. As soon as any parser in the list of `future_parsers` returns `True`, `current_parser` and `future_parsers` are updated.
- `src/grblogtools/api.py`
also has a method named `parse()`, which takes strings of log file patterns as individual arguments and returns a `ParseResult` object. The methods `summary()` and `progress()` of the `ParseResult` class return the summary dict and the search progress information as before. Another method of the `ParseResult` class is `parse()`, which takes the path to a single log file (the log file can include multiple runs) and uses `SingleLogParser` objects to parse each log run.
- `get_dataframe(logfiles, timelines=False, prettyparams=False)` is kept in the `api.py` module, i.e. the API is unchanged other than dropping `merged_logs` (these are handled automagically).
- `tests/`
contains separate tests for the individual parsers and the api. `test_refactor_regression.py` includes regression tests to ensure equivalent outputs from the current grblogtools api and the newly designed api. Test data is in the `data` folder, with newly added test data in `tests/assets`.
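The `current_parser`/`future_parsers` hand-off described above can be sketched roughly as follows (a minimal sketch: `RegexParser` and the patterns are illustrative stand-ins, not the actual grblogtools classes):

```python
import re


class RegexParser:
    """Toy stand-in for a section parser (the real ones live in
    src/grblogtools/parsers/); the regex patterns below are illustrative."""

    def __init__(self, pattern: str):
        self.pattern = re.compile(pattern)
        self._summary = {}

    def parse(self, line: str) -> bool:
        # True if the line is matched by a pattern associated with this parser.
        match = self.pattern.match(line)
        if match:
            self._summary.update(match.groupdict())
            return True
        return False

    def get_summary(self) -> dict:
        return self._summary


class SingleLogParser:
    """Sketch of the current_parser/future_parsers hand-off."""

    def __init__(self, parsers):
        # Start with the first (header) parser; the rest are tried later.
        self.current_parser = parsers[0]
        self.future_parsers = parsers[1:]

    def parse(self, line: str) -> bool:
        if self.current_parser.parse(line):
            return True
        # Not matched by the current parser: see if a later parser takes over.
        for i, parser in enumerate(self.future_parsers):
            if parser.parse(line):
                self.current_parser = parser
                self.future_parsers = self.future_parsers[i + 1:]
                return True
        return False


header = RegexParser(r"Gurobi Optimizer version (?P<Version>\S+)")
presolve = RegexParser(r"Presolve removed (?P<RemovedRows>\d+) rows")
log = SingleLogParser([header, presolve])

log.parse("Gurobi Optimizer version 9.5.0 build v9.5.0rc5 (linux64)")
log.parse("Presolve removed 12 rows and 3 columns")
print(log.current_parser is presolve)  # True: the hand-off happened
```

Keeping `future_parsers` ordered means each line is only ever tested against parsers for sections that could still plausibly apply.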
Excellent work! One question: would it make sense to define a base class for the different parsers? They all implement the same function anyway, but the classes do not seem to be connected by a common base class.
Thanks! I don't think abstract base classes really add much here. There's a common API between classes but the abstract class doesn't add any functionality, so I'm inclined to just leave it as is.
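For context, the kind of base class being discussed would look roughly like this (a hypothetical sketch, not code from the PR; the class and method names mirror the common API described above):

```python
from abc import ABC, abstractmethod


class Parser(ABC):
    """Hypothetical common base class for the section parsers; the actual
    refactor deliberately leaves the classes unconnected."""

    @abstractmethod
    def parse(self, line: str) -> bool:
        """Return True if the line is matched by this parser."""

    @abstractmethod
    def get_summary(self) -> dict:
        """Return the parsed summary data."""
```

Since the base class would carry no shared behaviour, its only effect would be enforcing the method signatures at instantiation time, which duck typing already covers informally.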
@mattmilten this is ready to go!
@maliheha and I are working on this on my fork. These updates aim to separate the parsers for different sections of the log into submodules, so things are easier to unit-test and modify as we go.
@mattmilten no need to review for now, we'll let you know when it's in a more complete state.
@ronaldvdv we noticed you're also adding some tests and example data - can we fold this into the refactored code? Just want to make sure we aren't duplicating work and writing tests that may eventually clash with one another. Happy to discuss and coordinate together.