logpai / logparser

A machine learning toolkit for log parsing [ICSE'19, DSN'16]

Added cleanData() and modified Spell_demo #76

Closed (axwitech closed 1 year ago)

axwitech commented 2 years ago

In Spell_demo, you can now specify a list of regex_remove expressions, in the same way as the existing preprocessing regexes. This list is used to pre-clean the data before the Spell algorithm starts. For example, pytest logs can contain strings such as "-----" or runs of multiple ":" characters. These seem to confuse Spell, which treats some of them as parameters, so the real parameters are not parsed.

I think this could be useful when developers already know some "noise" patterns that appear in their logs and want to remove them.
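To illustrate the idea, here is a minimal sketch of this kind of pre-cleaning step. It is not the code from this PR: the function name `clean_data`, its signature, the two patterns, and the sample log lines are all assumptions made for the example. It only shows how a list of regex_remove patterns could be applied to each line before the lines are handed to the parser.

```python
import re


def clean_data(lines, regex_remove):
    """Hypothetical helper: remove every match of each pattern from each log line.

    `lines` is a list of raw log lines, `regex_remove` a list of regex strings.
    """
    compiled = [re.compile(pattern) for pattern in regex_remove]
    cleaned = []
    for line in lines:
        for pattern in compiled:
            # Replace the noise with a space so neighbouring tokens stay separated.
            line = pattern.sub(" ", line)
        # Collapse the extra whitespace left behind by the removals.
        cleaned.append(" ".join(line.split()))
    return cleaned


# Assumed noise patterns: runs of three or more dashes, runs of two or more colons.
regex_remove = [r"-{3,}", r":{2,}"]

# Made-up sample lines in the style of pytest output.
sample = [
    "----- test_login.py::test_ok PASSED -----",
    "tests/test_api.py::test_get FAILED",
]

for line in clean_data(sample, regex_remove):
    print(line)
```

Single "-" and ":" characters are left untouched by these two patterns, so timestamps and ordinary key: value text survive the cleaning.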