anansi-project / comicinfo

ComicInfo.xml's new home
https://anansi-project.github.io/docs/category/comicinfo
MIT License

Document what the original ComicInfo format is #11

Open lordwelch opened 2 years ago

lordwelch commented 2 years ago

Currently there is nothing indicating where ComicRack left off and this project begins. Also I believe that these schemas (1.0 and 2.0) were not created by ComicRack as these are the only schemas that I've seen for ComicInfo files and everything that I've seen says that there were never any specifications published.

This looks like a great project, I look forward to seeing what evolves from this.

gotson commented 2 years ago

It seems to be agreed upon by at least a few actors that those schemas are as good a starting point as we can get!

It seems ComicRack uses v2.0. I suppose v1.0 came from an earlier version of ComicRack, but that is difficult to say with no source and no developer around.

SenorSmartyPants commented 2 years ago

This is the ComicRack for Windows (CRW) method that collects all properties written to ComicInfo.xml. The XML itself is written out by the XmlSerializer routines in System.Xml.Serialization.

public ComicInfo GetInfo()
{
    ComicInfo ci2;
    using (ItemMonitor.Lock(this))
    {
        ComicInfo ci = new ComicInfo
        {
            Writer = this.Writer,
            Publisher = this.Publisher,
            Imprint = this.Imprint,
            Penciller = this.Penciller,
            Inker = this.Inker,
            Series = this.Series,
            Number = this.Number,
            Count = this.Count,
            AlternateSeries = this.AlternateSeries,
            AlternateNumber = this.AlternateNumber,
            AlternateCount = this.AlternateCount,
            SeriesGroup = this.SeriesGroup,
            StoryArc = this.StoryArc,
            Title = this.Title,
            Summary = this.Summary,
            Volume = this.Volume,
            Year = this.Year,
            Month = this.Month,
            Day = this.Day,
            Notes = this.Notes,
            Genre = this.Genre,
            Colorist = this.Colorist,
            Editor = this.Editor,
            Letterer = this.Letterer,
            CoverArtist = this.CoverArtist,
            Web = this.Web,
            PageCount = this.PageCount,
            LanguageISO = this.LanguageISO,
            BlackAndWhite = this.BlackAndWhite,
            Manga = this.Manga,
            Format = this.Format,
            AgeRating = this.AgeRating,
            Characters = this.Characters,
            Teams = this.Teams,
            Locations = this.Locations,
            ScanInformation = this.ScanInformation
        };
        this.Pages.ForEach(delegate(ComicPageInfo cpi)
        {
            ci.Pages.Add(cpi);
        });
        ci2 = ci;
    }
    return ci2;
}
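
For readers who want to consume such a file outside .NET, here is a minimal Python sketch. It assumes each property assigned above ends up as a child element of the same name under a root `ComicInfo` element, which is consistent with the schema posted later in this thread. The `read_comicinfo` helper and the sample document are hypothetical, not part of ComicRack.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; element names follow the properties assigned in GetInfo().
SAMPLE = """<ComicInfo>
  <Series>Example Series</Series>
  <Number>3</Number>
  <Writer>Jane Doe</Writer>
  <PageCount>24</PageCount>
</ComicInfo>"""


def read_comicinfo(xml_text):
    """Return a dict mapping each simple child element name to its text."""
    root = ET.fromstring(xml_text)
    info = {}
    for child in root:
        # Elements that have children (such as <Pages>) are containers and are skipped.
        if len(child) == 0:
            info[child.tag] = (child.text or "").strip()
    return info


info = read_comicinfo(SAMPLE)
print(info["Series"], info["Number"])
```

All values come back as strings; converting `Number` or `PageCount` is left to the caller, since `Number` is a string in the schema.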
SenorSmartyPants commented 2 years ago

ComicPageInfo xml attributes

        [XmlAttribute("Image")]
        public int ImageIndex

        [XmlAttribute]
        [DefaultValue(null)]
        public string Bookmark

        [DefaultValue(0)]
        [XmlAttribute("ImageSize")]
        public int ImageFileSize

        [DefaultValue(0)]
        [XmlAttribute]
        public int ImageWidth

        [DefaultValue(0)]
        [XmlAttribute]
        public int ImageHeight

        [DefaultValue(ImageRotation.None)]
        [XmlAttribute]
        public ImageRotation Rotation

        [DefaultValue(ComicPagePosition.Default)]
        [XmlAttribute]
        public ComicPagePosition PagePosition

        [DefaultValue(null)]
        [XmlAttribute]
        public string Key

        [XmlAttribute("Type")]
        [Browsable(false)]
        [DefaultValue("Story")]
        public string TypeSerialized

SenorSmartyPants commented 2 years ago

    public enum ImageRotation : byte
    {
        None,
        [Description("90°")]
        Rotate90,
        [Description("180°")]
        Rotate180,
        [Description("270°")]
        Rotate270
    }
    public enum ComicPagePosition : short
    {
        Default,
        Near,
        Far
    }
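
As far as I know, XmlSerializer leaves out a member whose value equals its `[DefaultValue]`, so a reader has to fill those values back in. A minimal Python sketch of that, assuming the defaults shown in the decompiled attributes above (the `PAGE_DEFAULTS` table and `read_page` helper are hypothetical names):

```python
import xml.etree.ElementTree as ET

# Defaults copied from the [DefaultValue] attributes in the decompiled code above.
# Values stay strings because that is how XML attributes arrive.
PAGE_DEFAULTS = {
    "Bookmark": None,
    "ImageSize": "0",
    "ImageWidth": "0",
    "ImageHeight": "0",
    "Rotation": "None",
    "PagePosition": "Default",
    "Key": None,
    "Type": "Story",
}


def read_page(page_element):
    """Merge a <Page> element's attributes over the assumed defaults."""
    page = dict(PAGE_DEFAULTS)
    page.update(page_element.attrib)
    return page


page = read_page(ET.fromstring('<Page Image="0" Type="FrontCover" ImageWidth="1200" />'))
print(page["Type"], page["Rotation"], page["ImageWidth"])
```

`Image` has no default in the decompiled code, so it is only present when the file supplies it.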
gotson commented 2 years ago

@SenorSmartyPants can you clarify what this is about? I have literally no idea what this is or where this is from.

SenorSmartyPants commented 2 years ago

It is decompiled code from the latest ComicRack for Windows.

lordwelch commented 2 years ago

My intent with this issue is more to document which versions were created by ComicRack vs. this project. Right now it's not too difficult to figure out that the draft is what this project is working on and that the 1.0 and 2.0 versions were the originals. But eventually 2.1, or whatever version number is decided on, will be published, and then there won't be anything to go on to determine what ComicRack made vs. this project.

Most of the applications I've seen that support ComicRack or ComicInfo.xml metadata don't specify a version, but I believe they mostly support version 2. So I think it would be useful to make clear where this project has made changes. That gives people a reference for setting their expectations, whether they are creating an application or plugin and deciding what to support, or are just users trying to figure out what metadata will transfer between applications.

gotson commented 2 years ago

Regarding the version supported by most apps, v2 seems to be what's used, but that doesn't mean every element is supported, or supported in the accepted way.

Version 2.1 is also non-breaking: any 2.1 document could be processed by any application using the 2.0 schema; you would only be missing out on a couple of elements.
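
To illustrate what non-breaking means for a reader, here is a minimal Python sketch of a lenient parser that keeps the elements it knows and skips the rest. The `KNOWN_ELEMENTS` set is an abbreviated, illustrative subset, and `Tags` stands in for an element the older reader does not know. This models a lenient reader, not strict XSD validation.

```python
import xml.etree.ElementTree as ET

# Abbreviated, illustrative subset of element names an older reader might know.
KNOWN_ELEMENTS = {"Title", "Series", "Number", "Summary"}

# Hypothetical document carrying one element the older reader has never seen.
DOC = """<ComicInfo>
  <Series>Example Series</Series>
  <Number>1</Number>
  <Tags>space, robots</Tags>
</ComicInfo>"""


def read_known(xml_text, known=KNOWN_ELEMENTS):
    """Split child elements into understood values and ignored element names."""
    understood, ignored = {}, []
    for child in ET.fromstring(xml_text):
        if child.tag in known:
            understood[child.tag] = child.text or ""
        else:
            # Unknown elements are skipped rather than treated as errors.
            ignored.append(child.tag)
    return understood, ignored


understood, ignored = read_known(DOC)
print(understood, ignored)
```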

I don't think this repo should be the place to reference which application uses which version. However I do agree we should make it as clear as possible where we picked up from ComicRack. Not sure how to write it on the website, but PRs are most welcome!

SenorSmartyPants commented 2 years ago

ComicRack Windows Release version history

Some useful notes about when certain fields were added, especially these:

Notably absent from that bugfix are Main Character (which I know is not saved to comicinfo.xml but should be) and Review (which I don't use, so I am not sure whether it gets saved).

IngBertolini commented 2 years ago

Based on the ComicInfo.xml written by the last version of ComicRack, the original format should be the following:

<xs:schema elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="ComicInfo" nillable="true" type="ComicInfo" />
    <xs:complexType name="ComicInfo">
        <xs:sequence>
            <xs:element minOccurs="0" maxOccurs="1" default="" name="Title" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="Series" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="Number" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="-1" name="Count" type="xs:int" />
            <xs:element minOccurs="0" maxOccurs="1" default="-1" name="Volume" type="xs:int" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="AlternateSeries" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="AlternateNumber" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="StoryArc" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="SeriesGroup" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="-1" name="AlternateCount" type="xs:int" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="Summary" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="Notes" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="Review" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="-1" name="Year" type="xs:int" />
            <xs:element minOccurs="0" maxOccurs="1" default="-1" name="Month" type="xs:int" />
            <xs:element minOccurs="0" maxOccurs="1" default="-1" name="Day" type="xs:int" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="Writer" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="Penciller" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="Inker" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="Colorist" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="Letterer" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="CoverArtist" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="Editor" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="Publisher" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="Imprint" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="Genre" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="Web" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="0" name="PageCount" type="xs:int" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="LanguageISO" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="Format" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="Unknown" name="AgeRating" type="AgeRating" />
            <xs:element minOccurs="0" maxOccurs="1" default="Unknown" name="BlackAndWhite" type="YesNo" />
            <xs:element minOccurs="0" maxOccurs="1" default="Unknown" name="Manga" type="Manga" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="Characters" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="Teams" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="MainCharacterOrTeam" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="Locations" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" name="CommunityRating" type="Rating" />
            <xs:element minOccurs="0" maxOccurs="1" default="" name="ScanInformation" type="xs:string" />
            <xs:element minOccurs="0" maxOccurs="1" name="Pages" type="ArrayOfComicPageInfo" />
        </xs:sequence>
    </xs:complexType>
    <xs:simpleType name="YesNo">
        <xs:restriction base="xs:string">
            <xs:enumeration value="Unknown" />
            <xs:enumeration value="No" />
            <xs:enumeration value="Yes" />
        </xs:restriction>
    </xs:simpleType>
    <xs:simpleType name="Manga">
        <xs:restriction base="xs:string">
            <xs:enumeration value="Unknown" />
            <xs:enumeration value="No" />
            <xs:enumeration value="Yes" />
            <xs:enumeration value="YesAndRightToLeft" />
        </xs:restriction>
    </xs:simpleType>
    <xs:simpleType name="Rating">
        <xs:restriction base="xs:decimal">
            <xs:minInclusive value="0"/>
            <xs:maxInclusive value="5"/>
            <xs:fractionDigits value="1"/>
        </xs:restriction>
    </xs:simpleType>
    <xs:simpleType name="AgeRating">
        <xs:restriction base="xs:string">
            <xs:enumeration value="Unknown" />
            <xs:enumeration value="Adults Only 18+" />
            <xs:enumeration value="Early Childhood" />
            <xs:enumeration value="Everyone" />
            <xs:enumeration value="Everyone 10+" />
            <xs:enumeration value="G" />
            <xs:enumeration value="Kids to Adults" />
            <xs:enumeration value="M" />
            <xs:enumeration value="MA15+" />
            <xs:enumeration value="Mature 17+" />
            <xs:enumeration value="PG" />
            <xs:enumeration value="R18+" />
            <xs:enumeration value="Rating Pending" />
            <xs:enumeration value="Teen" />
            <xs:enumeration value="X18+" />
        </xs:restriction>
    </xs:simpleType>
    <xs:complexType name="ArrayOfComicPageInfo">
        <xs:sequence>
            <xs:element minOccurs="0" maxOccurs="unbounded" name="Page" nillable="true" type="ComicPageInfo" />
        </xs:sequence>
    </xs:complexType>
    <xs:complexType name="ComicPageInfo">
        <xs:attribute name="Image" type="xs:int" use="required" />
        <xs:attribute default="" name="Bookmark" type="xs:string" />
        <xs:attribute default="0" name="ImageSize" type="xs:long" />
        <xs:attribute default="-1" name="ImageWidth" type="xs:int" />
        <xs:attribute default="-1" name="ImageHeight" type="xs:int" />
        <xs:attribute default="None" name="Rotation" type="ImageRotation" />
        <xs:attribute default="Default" name="PagePosition" type="ComicPagePosition" />
        <xs:attribute default="" name="Key" type="xs:string" />
        <xs:attribute default="Story" name="Type" type="ComicPageType" />
        <xs:attribute default="false" name="DoublePage" type="xs:boolean" />
    </xs:complexType>
    <xs:simpleType name="ImageRotation">
        <xs:restriction base="xs:string">
            <xs:enumeration value="None" />
            <xs:enumeration value="Rotate90" />
            <xs:enumeration value="Rotate180" />
            <xs:enumeration value="Rotate270" />
        </xs:restriction>
    </xs:simpleType>
    <xs:simpleType name="ComicPagePosition">
        <xs:restriction base="xs:string">
            <xs:enumeration value="Default" />
            <xs:enumeration value="Near" />
            <xs:enumeration value="Far" />
        </xs:restriction>
    </xs:simpleType>
    <xs:simpleType name="ComicPageType">
        <xs:list>
            <xs:simpleType>
                <xs:restriction base="xs:string">
                    <xs:enumeration value="FrontCover" />
                    <xs:enumeration value="InnerCover" />
                    <xs:enumeration value="Roundup" />
                    <xs:enumeration value="Story" />
                    <xs:enumeration value="Advertisement" />
                    <xs:enumeration value="Editorial" />
                    <xs:enumeration value="Letters" />
                    <xs:enumeration value="Preview" />
                    <xs:enumeration value="BackCover" />
                    <xs:enumeration value="Other" />
                    <xs:enumeration value="Deleted" />
                </xs:restriction>
            </xs:simpleType>
        </xs:list>
    </xs:simpleType>
</xs:schema>
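
Two of the simple types above are easy to misread: `ComicPageType` is an `xs:list`, so one `Type` attribute may hold several whitespace-separated values, and `Rating` is a decimal from 0 to 5 with at most one fraction digit. A minimal Python sketch of both checks follows; the function names are hypothetical, and `Decimal` parsing only approximates the `xs:decimal` lexical rules (it also accepts exponent notation, for instance).

```python
from decimal import Decimal, InvalidOperation

# Enumeration values copied from the ComicPageType definition above.
PAGE_TYPES = {
    "FrontCover", "InnerCover", "Roundup", "Story", "Advertisement",
    "Editorial", "Letters", "Preview", "BackCover", "Other", "Deleted",
}


def parse_page_type(value):
    """Split an xs:list value on whitespace and check each item against the enumeration."""
    items = value.split()
    unknown = [item for item in items if item not in PAGE_TYPES]
    if unknown:
        raise ValueError(f"unknown page type(s): {unknown}")
    return items


def is_valid_rating(value):
    """Check the Rating facets: decimal, 0 <= value <= 5, at most one fraction digit."""
    try:
        number = Decimal(value)
    except InvalidOperation:
        return False
    if not number.is_finite():
        return False
    scaled = number * 10
    # With at most one fraction digit, ten times the value is a whole number.
    return Decimal(0) <= number <= Decimal(5) and scaled == scaled.to_integral_value()


print(parse_page_type("FrontCover Story"), is_valid_rating("4.5"), is_valid_rating("4.25"))
```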
IngBertolini commented 1 year ago

I added the missing attributes in ComicPageInfo based on @SenorSmartyPants's comments.

ferdnyc commented 1 year ago

@lordwelch

Currently there is nothing indicating where ComicRack left off and this project begins. Also I believe that these schemas (1.0 and 2.0) were not created by ComicRack as these are the only schemas that I've seen for ComicInfo files and everything that I've seen says that there were never any specifications published.

My intent with this issue is more to document which versions were created by ComicRack vs this project.

I definitely appreciate (and support!) your efforts to better delineate provenance/history regarding the ComicInfo formats, but it seems to me that these two aims are somewhat contradictory with each other.

As you say, ComicRack created none of the schemas; they never bothered to document or formalize their metadata at all. So, in a sense, "everything" in the repo (all of the versions) was created by this project, none of it by ComicRack.

While schemas 1.0 and 2.0 may have been reverse-engineered from certain versions of ComicRack, without decompiling every release of ComicRack it's difficult to really say with any confidence:

  1. Which schema versions are supported by which ComicRack versions
  2. That either version is fully supported by any given ComicRack version
  3. That there aren't more versions beyond / in between 1.0 and 2.0, where ComicRack subtly/incrementally changed the format in ways not covered by the existing schemas

I think crediting ComicRack software for inspiring and initially implementing the format is important, and I think this project has done well in making sure to give that credit where it's due. But to say that they "created" anything here would be inaccurate, since ultimately all of the work here was merely built on the example set by ComicRack, not actually done by that project.

lordwelch commented 1 year ago

I definitely appreciate (and support!) your efforts to better delineate provenance/history regarding the ComicInfo formats, but it seems to me that these two aims are somewhat contradictory with each other.

I don't see how there are two different aims. Here I stated:

Currently there is nothing indicating where ComicRack left off and this project begins.

It can be argued that the README technically covers this:

Schemas are available in schema, each version in a separate directory.

Current drafts are available in drafts, each version in a separate directory.

But it doesn't, at least not in any way that lets a new person come, look at it, and see where the lines are drawn.

My next statement is:

My intent with this issue is more to document which versions were created by ComicRack vs this project.

This is a direct restatement of my first statement and has no conflict with it. I put it in because the thread had started to go down the route of decompiling and finishing the v2.0 schema, which is not what I intended this ticket to be about.

I'm kind of disappointed in the rest of your reply, as it is stated with confidence, as if it were fact, without a shred of proof. Much of it is provably wrong, and to me it seems to try to take away from the accomplishment of cYo (the single creator of ComicRack). He, as far as I can tell, single-handedly created, in his own words, "The best Comic Reader in the World" and the only enduring and actually used metadata format for comics. ComicRack has been dead and gone for several years, and guess what: it's actually still around! There is a subreddit for ComicRack users https://www.reddit.com/r/comicrackusers/ that is still fairly active years after the website went down. So even while dead it is still the best. Kavita, Codex, ComicTagger (I'm the current maintainer of CT) and the others are all great projects, but they don't come close to feature parity or, in my opinion, the professionalism that ComicRack had when it was published. Here is the manual for reference: https://web.archive.org/web/20230101103419/https://sites.google.com/site/comicrackmanual/home

In any case, this repo and the anansi project are about moving forward. So here is all of the information I've been able to pull together in the last 9ish hours or so. Feel free to do with it what you will.

@ferdnyc Turns out I was wrong. ComicRack did publish at least one version of the schema.

Metadata schema posted 07 Jan 2008 21:19 https://web.archive.org/web/20161229183519/http://comicrack.cyolito.com:80/forum/7-general/709-using-comicinfo-xml-in-other-viewers#713

I've posted the schema definition in the Downloads/Support category. I know it is not pretty, but it is all there currently is :) . It is rather stable. no big changes in the last year (maybe the addition of some new element). As you can see it was designed to be compact, small and simple. No fancy attributes, just simple plain XML elements to keep parsing simple and the file very human readable.

The news archive shows that version 0.9.74 was released the same day https://web.archive.org/web/20090913033343/http://comicrack.cyolito.com:80/home/newsarchive?start=20

The Internet Archive shows that ComicInfoSchema.zip was last updated on 29 Aug 2009. This may simply have been a re-upload; otherwise it would include the first three changes in the changelog below. https://web.archive.org/web/20150525081311/http://comicrack.cyolito.com/downloads/ComicRack/Support-Files/

Here you can see someone locating v1.0 and v2.0 before this repository even existed: https://www.reddit.com/r/comicrackusers/comments/ijaj77/anyone_have_a_copy_of_comicinfoschemazip/ It seems that vaemendis https://gist.github.com/vaemendis/9f3ed374f215532d12bda3e812a130e6 and Tom Galloway over on SourceForge https://sourceforge.net/p/comix/patches/37/ are the only people who re-uploaded the original. mrnejc found or created v2.0 here https://gist.github.com/mrnejc/1e6da859de493a333c3f45c153d8036c which, whitespace aside, is the same as the v2.0 originally uploaded to this repository. I do not know whether this is where gotson obtained his v2.0, but the gist was uploaded around the same time that he started working on the anansi-project (as evidenced by the creation dates on the other repos).

The precise changes that the v2.0 schema has are:

Specifically it does not include:

For reference, the last published version is 0.9.178. You can get the full changelog by installing ComicRack from here https://web.archive.org/web/20181018173957/http://comicrack.cyolito.com/downloads and you will find it in "C:\Program Files\ComicRack\Changes.txt".

The defaults for age rating, genre and format are located in "C:\Program Files\ComicRack\DefaultLists.txt". In this post https://web.archive.org/web/20161229183519/http://comicrack.cyolito.com:80/forum/7-general/709-using-comicinfo-xml-in-other-viewers#852 cYo states that these are just defaults, asks for better ones, and says you can edit them for your own uses. So for the purposes of this schema I think these fields should be considered free-form, with the listed values being nothing more than suggestions to start you off.

This changelog is only the metadata changes starting from when the first schema was published to the last metadata change. The dates are found by using the "Old Versions" section of the archived downloads section of the ComicRack website http://comicrack.cyolito.com/downloads/comicrack/ComicRack/Old-Versions/orderby%2C2/page%2C2/

Build 0.9.86: 13 July 2008

Build 0.9.89: 27 July 2008

Build 0.9.91: 01 August 2008

Build 0.9.100: 24 December 2008

Build 0.9.119: 05 Apr 2010

Build 0.9.122: 15 May 2010

Build 0.9.126: 24 Jul 2010 - 23 Aug 2010

Build 0.9.129: 12 Sep 2010

Build 0.9.130: 12 Sep 2010

Build 0.9.135: 01 Mar 2011

Build 0.9.141: 18 May 2011

Build 0.9.142: 17 Jul 2011

Build 0.9.151: 05 Feb 2012

Build 0.9.160: 27 Jan 2013

As an ending note, I don't care much for decompiling a closed-source program, except to personally learn about decompiling, but many have differing opinions; see 1 and 2 if you are interested.

t0815 commented 10 months ago

Hi there, tbh the format is kind of a mess. I'm currently developing a CBZ archive editor (Win_CBZ) that implements the ComicInfo.xml schema, including almost everything the schema describes. However, every tool handles the metadata in a different way. I did notice that most applications don't honor the PageIndex at all (for example, ordering pages according to the index and recognizing which page is supposed to be the FrontCover).
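
A minimal Python sketch of honoring the page data, assuming "PageIndex" here refers to the `Image` attribute of each `Page` element and that a missing `Type` means `Story`, as in the schema posted above. The helper names and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample with pages listed out of order.
DOC = """<ComicInfo>
  <Pages>
    <Page Image="2" />
    <Page Image="0" Type="InnerCover" />
    <Page Image="1" Type="FrontCover" />
  </Pages>
</ComicInfo>"""


def ordered_pages(xml_text):
    """Return <Page> elements sorted by their integer Image attribute."""
    root = ET.fromstring(xml_text)
    return sorted(root.findall("./Pages/Page"), key=lambda p: int(p.get("Image")))


def front_cover_index(xml_text):
    """Return the Image index of the first page typed FrontCover, or None."""
    for page in ordered_pages(xml_text):
        # Type is a whitespace-separated list in the schema; missing means "Story".
        if "FrontCover" in page.get("Type", "Story").split():
            return int(page.get("Image"))
    return None


print([p.get("Image") for p in ordered_pages(DOC)], front_cover_index(DOC))
```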

One more important field, that is missing from the current spec is the Tags field, a comma-separated list of tags. Komga and other tools are parsing this field to generate their tags.
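
Parsing such a field takes only a few lines. A minimal Python sketch, assuming tags are separated by commas and that surrounding whitespace and empty entries should be dropped (the `split_tags` helper is hypothetical and not taken from Komga or any other tool):

```python
def split_tags(value):
    """Split a comma-separated Tags value, trimming whitespace and dropping empties."""
    return [tag.strip() for tag in (value or "").split(",") if tag.strip()]


print(split_tags("space opera, robots ,, heist"))
```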

The CBZ_Editor is currently in Beta btw, and can be downloaded and tested 😄.

gotson commented 10 months ago

tbh the format is kind of a mess

This is an understatement 🤣

One more important field, that is missing from the current spec is the Tags field

It's here https://github.com/anansi-project/comicinfo/blob/fea1f5576eeb2fde406957b9cc880f438bb1e985/drafts/v2.1/ComicInfo.xsd#L30

t0815 commented 10 months ago

Ah nice, guess I did miss it. I was just wondering, since I was pretty sure it was there before 😄