ciaranm / glasgow-subgraph-solver

A solver for subgraph isomorphism problems, based upon a series of papers by subsets of McCreesh, Prosser, and Trimble.
MIT License
62 stars 23 forks

detect_format not working on csv if file has CRLF line endings #7

Closed Ishadijcks closed 3 years ago

Ishadijcks commented 3 years ago

When loading a CSV file with CRLF line endings, I got the following error:

unable to auto-detect file format (no recognisable header found)
Maybe try specifying one of --format, --pattern-format, or --target-format?

Running it with --target-format=csv, the solver runs, but the printed mapping comes out garbled. Not sure why.

// Expected
mapping = (A -> Orange) (C1 -> Fruit) (B -> Banana) (C -> Blackberry) (D -> Strawberry) (E -> Apple) (F -> Lime) (E1 -> Contains AN) (E6 -> Citrus) (E2 -> Starts with B) (E3 -> Berry) (E4 -> Red) (E5 -> Green) 

// Actual
) (E5 -> Greens with BBlackberry) (D -> Strawberry) (E -> Apple) (F -> Lime) (E1 -> Contains AN

Performing a diff against a working file showed that the only difference was the line endings. I haven't tested it for other formats.
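One plausible explanation for the scrambled output, assuming each target name carries a trailing carriage return and the mapping is printed to a terminal: the terminal moves the cursor back to the start of the line at every CR, so later text overwrites earlier text. The sketch below only approximates that rendering; `render_with_carriage_returns` is a hypothetical name used for illustration and is not part of the solver.

```cpp
#include <string>

// Approximates how a terminal renders a single line of text: '\r' moves the
// cursor back to column 0, and subsequent characters overwrite whatever was
// already printed there. (Illustrative helper, not solver code.)
std::string render_with_carriage_returns(const std::string & text)
{
    std::string screen;
    std::string::size_type cursor = 0;
    for (char c : text) {
        if (c == '\r') {
            cursor = 0; // carriage return: back to the start of the line
            continue;
        }
        if (cursor < screen.size())
            screen[cursor] = c; // overwrite an already-printed character
        else
            screen.push_back(c); // extend the line
        ++cursor;
    }
    return screen;
}
```

Feeding it a mapping string in which every target name ends in `\r` leaves a line that begins with `)` and contains leftover fragments of longer names, which is consistent with the "Actual" output above.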

For now I've changed my pre-processor to output LF line endings, and it works just fine.

Just thought I'd let you know. This is a great tool, keep it up!

ciaranm commented 3 years ago

We use C++ standard filestreams for reading files. On a Unix platform, if you feed it a file with CRLF line endings, the CR will be treated as part of the input. My guess is that this means that the CR will end up as part of the vertex name for the second vertex on each line, which will cause all sorts of strangeness.
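The effect described here can be reproduced in isolation. The following is a minimal sketch, assuming a two-column, comma-separated edge list; `read_edges` is a hypothetical helper written for this illustration and is not the solver's actual reader. It uses a string stream, which never translates line endings, to stand in for a file opened on a Unix platform.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not taken from the solver: reads "from,to" lines the
// way a straightforward std::getline-based CSV reader might.
std::vector<std::pair<std::string, std::string>> read_edges(std::istream & in)
{
    std::vector<std::pair<std::string, std::string>> edges;
    std::string line;
    // std::getline splits on '\n' only, so a preceding '\r' stays in `line`.
    while (std::getline(in, line)) {
        auto comma = line.find(',');
        if (comma == std::string::npos)
            continue; // skip lines without a separator
        edges.emplace_back(line.substr(0, comma), line.substr(comma + 1));
    }
    return edges;
}
```

With the input `"A,Orange\r\nB,Banana\r\n"`, the first name on each line comes back clean (`"A"`, `"B"`), while the second keeps the carriage return (`"Orange\r"`, `"Banana\r"`).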

I'm not sure what the best way of handling this is. The standard library doesn't have a standard way of dealing with different line endings, but we could write our own I suppose. My gut feeling is that a better idea would be to throw an exception if a vertex with a "weird" name is encountered, which would at least give the user a clue that something strange is happening rather than outputting odd results.

Ishadijcks commented 3 years ago

> to throw an exception if a vertex with a "weird" name is encountered, which would at least give the user a clue that something strange is happening rather than outputting odd results.

Agreed, this sounds like the easiest solution!