Closed lordwelch closed 7 months ago
I can't thank you enough for these test cases. I changed a couple of big things about the philosophy of the parser because of them. One is that I no longer divide tokens by the - character.
Most of these test cases now pass with version 0.2.0.
Maybe two of the test cases pass with slightly different dict data than you provided due to small differences of opinion. Two of the test cases I elected not to fix, again due to a difference of opinion:
WONTFIX = {
    # Leading issue number is usually an alternate sequence number
    # WONTFIX: Series names may begin with numerals.
    "52 action comics #2024.cbz": {
        "ext": "cbz",
        "issue": "2024",
        "series": "action comics",
        "alternate": "52",
    },
    # Only the issue number. CT ensures that the series always has a value if possible
    # WONTFIX: I don't think making the series the same as the number is valuable.
    "#52.cbz": {
        "ext": "cbz",
        "issue": "52",
        "series": "52",
    },
}
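For anyone who wants to see exactly where a parser disagrees with expected dicts like these, here is a minimal sketch of a comparison helper. Nothing in it comes from comicfn2dict or ComicTagger: `diff_cases` and `toy_parse` are hypothetical names, and `toy_parse` is a deliberately naive stand-in that you would swap for the real parser call.

```python
from typing import Callable, Dict, Optional, Tuple

ParsedName = Dict[str, str]
Mismatch = Dict[str, Tuple[Optional[str], Optional[str]]]


def diff_cases(
    parse: Callable[[str], ParsedName],
    cases: Dict[str, ParsedName],
) -> Dict[str, Mismatch]:
    """Return, per filename, each key whose (expected, actual) values differ."""
    report: Dict[str, Mismatch] = {}
    for filename, expected in cases.items():
        actual = parse(filename)
        mismatched = {
            key: (expected.get(key), actual.get(key))
            for key in sorted(set(expected) | set(actual))
            if expected.get(key) != actual.get(key)
        }
        if mismatched:
            report[filename] = mismatched
    return report


def toy_parse(filename: str) -> ParsedName:
    """Hypothetical stand-in parser, NOT the comicfn2dict algorithm.

    It splits off the extension and treats everything after the last '#'
    as the issue number.
    """
    stem, _, ext = filename.rpartition(".")
    series, _, issue = stem.rpartition("#")
    result = {"ext": ext, "issue": issue}
    if series.strip():
        result["series"] = series.strip()
    return result


# The two cases quoted above, as the expected side of the comparison.
CASES = {
    "52 action comics #2024.cbz": {
        "ext": "cbz",
        "issue": "2024",
        "series": "action comics",
        "alternate": "52",
    },
    "#52.cbz": {"ext": "cbz", "issue": "52", "series": "52"},
}

if __name__ == "__main__":
    for name, mismatch in diff_cases(toy_parse, CASES).items():
        print(name, mismatch)
```

Reporting only the differing keys keeps the "difference of opinion" cases easy to spot, since each one shows up as a short list of (expected, actual) pairs rather than a failed whole-dict comparison.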
I am open to new ideas and opinions about how this works, so if you feel a way about any of this, feel free to pipe up. comicfn2dict is primarily used by comicbox, which is similar to ComicTagger in that it manually tags comics and reads a variety of comic tag formats, but it doesn't do any of the really useful or difficult stuff with online comic databases and identification, and it doesn't have a nice GUI.
I've included some of the test cases from ComicTagger here that this project doesn't handle the same way. These are a bit opinionated, so not all of them necessarily need to be "fixed". I put in comments explaining how or why CT handles most of them. I also left out the scan_info/remainders, as CT and this project have different cleanup strategies for them.
Also note that CT (with the complicated parser) parses all of these the same whether the # is there or not (except for the first one, where it doesn't find the issue number).
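To illustrate what "the same whether the # is there or not" can look like in practice, here is a small sketch that treats the # as optional when looking for a trailing issue number. This is not how CT's parser works internally; `ISSUE_RE` and `find_issue` are hypothetical names, and the approach only covers filenames where the issue number is the last thing before the extension.

```python
import re
from typing import Optional

# Hypothetical pattern: an optional '#' followed by the digits that end the stem.
ISSUE_RE = re.compile(r"#?(\d+)\s*$")


def find_issue(filename: str) -> Optional[str]:
    """Return the trailing issue number of a filename, with or without '#'."""
    stem = filename.rsplit(".", 1)[0]  # drop the extension
    match = ISSUE_RE.search(stem)
    return match.group(1) if match else None


if __name__ == "__main__":
    for name in ("action comics #2024.cbz", "action comics 2024.cbz"):
        print(name, "->", find_issue(name))
```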