artasparks / glift

Go Lightweight Frontend
MIT License

Handle Pandanet IGS SGF files with "Copyright" info #101

Closed varenius closed 9 years ago

varenius commented 9 years ago

Recent files from the Pandanet IGS website from the European Team Championships (http://pandanet-igs.com/communities/euroteamchamps/rounds/226) contain the following info in the SGF:

CoPyright[ Copyright (c) PANDANET Inc. 2014 Permission to reproduce this game is given, provided proper credit is given. No warrantee, implied or explicit, is understood. Use of this game is an understanding and agreement of this notice. ]

This makes Glift fail with this console error:

"Error: SGF Parsing Error: At line [3], column [2], char [o], Unexpected character in property name"

I guess this could be handled in a nice way by Glift. Note the capital P in the Pandanet file, which is probably also a mistake, so any "copyright" key should be compared in lowercase (or similar) when Glift reads it in.

artasparks commented 9 years ago

That's unfortunate. CoPyright is a mistake -- properties must be all caps according to the spec. I don't think I want the parser to accept malformed SGFs.

However, I think the way to natively support IGS files would be to add a transformer option that does some pre-processing over the file before the SGF is displayed. In this case, it could just convert CoPyright to COPYRIGHT. I've been meaning to use similar functionality to display Tygem .gib files, since Glift already has a .gib parser.
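For illustration, a pre-processing step like the one described above might look roughly like this. It is only a sketch: `normalizeIgsCopyright` is a hypothetical name, not part of Glift's API, and it assumes the only problem in the file is the mixed-case `CoPyright` identifier.

```javascript
// Hypothetical pre-processing helper; NOT part of Glift's API.
// Rewrites the mixed-case IGS identifier "CoPyright" to "COPYRIGHT"
// so that a strict all-caps parser no longer rejects it.
function normalizeIgsCopyright(sgf) {
  // Match "CoPyright" only when it is directly followed by "[" (optionally
  // after whitespace), i.e. where it is used as a property identifier.
  // Caveat: this is a plain text substitution, so the same sequence inside
  // a comment value would be rewritten as well.
  return sgf.replace(/\bCoPyright(?=\s*\[)/g, 'COPYRIGHT');
}

// Usage: run the raw file contents through the helper before parsing.
const raw = '(;GM[1]\nCoPyright[ Copyright (c) PANDANET Inc. 2014 ]\n;B[pd])';
const cleaned = normalizeIgsCopyright(raw);
console.log(cleaned);
```

The match is case-sensitive, so the word "Copyright" inside the property value is left alone.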

varenius commented 9 years ago

Aha. I guessed the problem was due to IGS not following the format, and I agree the parser should expect reasonable input. A transformer option could be nice until IGS changes so that their files are written properly, which I hope will happen at some point.

Perhaps Glift could just ignore any keywords (i.e. before the SGF data) that it doesn't understand? Ignoring these kinds of things seems like a reasonable fallback.

Eskil

artasparks commented 9 years ago

Ok, in 1.0.3 there will be a new parseType option in sgfOptions. Now there's:

yewang commented 9 years ago

I know this is already a closed issue, but I thought I should point out that lowercase letters were allowed in property identifiers for FF[1]-FF[3], and only FF[4] forbids lowercase letters. Since Pandanet's SGF files omit the FF property, I suppose the FF[1] default applies, which technically seems to mean that their files are valid and just following an older standard.

For these older file formats, lowercase letters in property identifiers were simply ignored, and only the remaining uppercase letters would be interpreted. Hence:

CoPyright => CP
White => W
thisaBlackmove => B

The guide for converting older file formats discusses this issue: http://www.red-bean.com/sgf/converting.html

Accounting for this would be necessary for dealing with older, but valid, file formats.
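As a rough sketch of the rule described above (drop lowercase letters from property identifiers, keep property values untouched), something like the following could work. The function names are hypothetical and not part of Glift; the scanner assumes that everything between `[` and the next unescaped `]` is a property value.

```javascript
// Hypothetical helpers; NOT part of Glift's API.

// FF[1]-FF[3] rule: only the uppercase letters of an identifier count.
function normalizePropIdent(ident) {
  return ident.replace(/[a-z]/g, '');
}

// Apply the same rule to a whole SGF string, leaving property values alone.
function stripLowercaseIdents(sgf) {
  let out = '';
  let inValue = false; // true while between "[" and its closing "]"
  for (let i = 0; i < sgf.length; i++) {
    const ch = sgf[i];
    if (inValue) {
      out += ch;
      if (ch === '\\' && i + 1 < sgf.length) {
        out += sgf[++i]; // copy the escaped character verbatim
      } else if (ch === ']') {
        inValue = false;
      }
    } else if (ch === '[') {
      inValue = true;
      out += ch;
    } else if (ch < 'a' || ch > 'z') {
      out += ch; // outside values, keep everything except a-z
    }
  }
  return out;
}

console.log(normalizePropIdent('CoPyright'));      // CP
console.log(normalizePropIdent('thisaBlackmove')); // B
console.log(stripLowercaseIdents('(;White[dp]thisaBlackmove[pd])'));
```

Running such a pass before a strict FF[4] parser would turn CoPyright into CP, which is the interpretation the older formats give it.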