GregorCH / ipet

Interactive Performance Evaluation Tools for Optimization Software
MIT License

use substring instead of split to parse metafile #91

Closed alexhoen closed 3 years ago

alexhoen commented 3 years ago

I have metafiles where the datum can contain "@" symbols. Currently, the parsing in the StatisticReader doesn't support that; instead, the program exits with an error.

In the current program, each line of the metafile is split at @. So if there is a second @ in the line, the assumptions made by the subsequent code aren't fulfilled. Since the split is currently only done to remove the @, I would instead use a substring to remove the @ and then obtain the datum and the attribute.
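To illustrate the difference, here is a minimal sketch of the two approaches. It is not the actual IPET code: the function names `parse_with_split` and `parse_with_substring` are hypothetical, and the split-based version is only an assumed reconstruction of the behaviour described above.

```python
def parse_with_split(line):
    """Assumed reconstruction of the split-based approach (hypothetical)."""
    # Splitting at every '@' yields exactly two parts only if the line
    # contains a single '@'; a second '@' makes the unpacking fail.
    _, rest = line.split("@")
    attribute, datum = rest.split(None, 1)
    return attribute, datum.strip()


def parse_with_substring(line):
    """Substring-based approach: drop only the leading '@' (hypothetical)."""
    # line[1:] removes the first character and leaves any later '@' alone.
    attribute, datum = line[1:].split(None, 1)
    return attribute, datum.strip()
```

With a line such as `@GitHash @abc123`, the split-based sketch raises a `ValueError` during unpacking, while the substring-based sketch returns the attribute `GitHash` and the datum `@abc123`.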

GregorCH commented 3 years ago

Yeah, that previous split was weird and even a bit slow, anyway. Could you please extend the current code, however, and throw an exception if the first character on the line is not an @-character? After all, this is the required format for a meta file.

And please be careful, however, about the assumption you make: in general, one should avoid that log files and the associated files like meta, err, etc. contain strings with arbitrary bulls*** in them. In that case, the format is simply illegal and cannot be parsed. Why, for example, should a date contain an @ character? This character is reserved for email addresses, the Cor@l benchmark library, and IPET meta files.

IPET parses start and end date of a run from the @03 and @04 tags in the log files, if I remember correctly. These are integers and can be used for sorting.

alexhoen commented 3 years ago

I unintentionally added an @ symbol before the githash. (I removed that.) Nevertheless, I think IPET shouldn't exit with a compile error if that happens, so this would be a possible fix (assuming the following code can handle the @).

    """
    Read lines of the form
    @Key Value
    from meta, out and err file and stores 'Value' in a Field 'Key'.
    """

Since this function is also used to parse the .err and the .out file (see comment above), I think the exception wouldn't be correct.

GregorCH commented 3 years ago

I have never heard of a compile error in Python. What exactly do you mean by that? I have problems imagining a situation where this could fail.

Note that the MetaDataReader uses a regular expression that ensures that it only parses lines starting with an @ followed by at least 3 non-whitespace characters. (Thereby, it bypasses @01, @02, @03, @04 in an out file.)
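As a rough sketch of that filtering (the pattern `META_LINE` and the function `read_meta` below are hypothetical and not copied from the IPET sources; the pattern is only an assumption that matches the description above):

```python
import re

# Hypothetical pattern: '@', then at least 3 non-whitespace characters
# (the key), then whitespace and the start of a value.
META_LINE = re.compile(r"^@\S{3,}\s+\S")


def read_meta(lines):
    """Collect 'Key Value' pairs from lines of the form '@Key Value'."""
    fields = {}
    for line in lines:
        # Lines such as '@01 ...' have only two characters after the '@'
        # and therefore never reach the parsing step.
        if META_LINE.match(line) is None:
            continue
        key, value = line[1:].split(None, 1)
        fields[key] = value.strip()
    return fields
```

Under these assumptions, lines like `@01 instance.mps` or `@03 1600000000` are skipped, plain solver output is ignored, and `@GitHash @abc123` is stored as the field `GitHash` with the value `@abc123`.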

GregorCH commented 3 years ago

Contemplating about what I just wrote, I don't think an exception makes any sense because the regular expression ensures that the exception cannot occur.

alexhoen commented 3 years ago

> Contemplating about what I just wrote, I don't think an exception makes any sense because the regular expression ensures that the exception cannot occur.

That was what I was trying to say but failed ;-)