Metron-Project / metroninfo

Digital Comic Book Metadata XML Schema
MIT License

Additional Series Information #38

Open bpepple opened 1 month ago

bpepple commented 1 month ago

Some additional Series information might be useful to include like:

  1. Series Status: Using values like "Ongoing", "Cancelled", "Complete", "Hiatus".
  2. Count: Issue count for a series.

The big downside to this is that these values change over time, and since they are series-level values, comics within the same series could end up tagged with different values.

majora2007 commented 2 weeks ago

We need something like Count or Series Status. Count is currently in the ComicInfo spec, and while it's not great (it isn't clear how to use it for more than loose issues), it offers the ability to tell whether a Series is completed or has ended publishing.

The current spec does not offer any of that ability and that will be a major pain point in the software. Users generally tag once when they start a series and once when the series concludes. This is not something that needs to be tagged per file. To my knowledge, Komga and Kavita both take the information for series from the first issue.

The lack of the field means in order to keep the same functionality (which to me is critical), these servers will have to integrate with upstream APIs and have this new dependency. This is unlikely to happen in Kavita and Komga.

Series Status having the ability to carry meta state would be beneficial. So instead of just Ongoing/Complete, having Cancelled and Hiatus offers information that ComicInfo is missing. Pair it with some sort of Count and server software has a really rich way of informing users of the state of the Series.

Buried-In-Code commented 2 weeks ago

The problem that I have with including Status and Count is that every time a new issue in a series is released I need to retag the whole series to update the Count element. The Status is even worse as that can change at any point in time, so I'd need to be constantly checking with a service (e.g. Metron) to see if the status has changed. If I'm constantly doing this check, why do I need to include it in a static file?

majora2007 commented 2 weeks ago

@Buried-In-Code why would you need to retag the whole series? But even then, Count would only be set once a Count is known. Count is the number of issues released. When a series is releasing, that number is not known. It's only known once the series is completed.

Self-hosted servers only check the first file, so you'd only need to tag that one file. I'm bringing the perspective of self-hosted servers and the users that use them. It doesn't sound like you use software like Komga/Kavita/Codex/Stump.

bpepple commented 2 weeks ago

My gut instinct is not to include this, since it seems to be very specific to only software like Kavita, Komga, etc, but if it helps to make support for the new schema more palatable to those potential users I guess I'd be open to adding the additional fields.

I do see @Buried-In-Code's point though, since I think those folks that don't use software like Kavita will likely be re-tagging their comics for these changes (since comic collectors tend to be fairly obsessive).

Maybe we can check with the ComicTagger, Codex, and Komga devs to see if they have an opinion?

ajslater commented 2 weeks ago

I like how Metron has a seriesType and publisherType to encapsulate stuff like this more logically.

ComicInfo.xml has the poorly named Count field. ComicBookInfo.json has both numberOfVolumes and numberOfIssues fields.

In comicbox/Codex I have issueCount and volumeCount to support these fields and they get attached to Series and Volume database columns.

I notice that Metron does not have imprintType or volumeType. I personally would place issueCount in a volumeType, and volumeCount in the seriesType because it's similar to what I've done in my own metadata code.

I don't think you do need to retag all books in a volume or series to get accurate data if the reader always uses the maximum value for a volume or series found, which is what I do in Codex. When Solarpunk Adventures #001 comes out I update the series count to whatever is in the metadata. When Solarpunk Adventures #002 is imported later, if it's tagged with series count, I update the series in the database with max(series.volume_count, new_imported_volume_count) and similarly max(volume.issue_count, new_imported_issue_count).
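A minimal sketch of that max-based merge, not Codex's actual code. `SeriesRecord` and `merge_count` are hypothetical names, and the counts in the loop are made-up values standing in for whatever successive imports happen to carry:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SeriesRecord:
    """Hypothetical stand-in for a series row in a reader's database."""
    name: str
    volume_count: Optional[int] = None


def merge_count(stored: Optional[int], imported: Optional[int]) -> Optional[int]:
    """Keep the largest count seen so far, ignoring missing values."""
    known = [c for c in (stored, imported) if c is not None]
    return max(known) if known else None


series = SeriesRecord("Solarpunk Adventures")
# Counts found in successively imported files; None means the file was untagged.
for tagged_count in (None, 1, 3, 2):
    series.volume_count = merge_count(series.volume_count, tagged_count)

print(series.volume_count)  # 3
```

Because the stored value only ever grows, an older file with a stale count cannot overwrite a newer one, which is why only the most recently tagged file needs an accurate count.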

Philosophy and model design aside, it's a nice-to-have bit of data but not at all crucial, certainly not to readers. I'd like to see an issueCount attribute, but not strongly. I'd also encourage what ComicBookInfo does with both volumeCount and issueCount. If both of those went away forever my users would barely notice.

Series Status seems primarily useful for Mylar-like programs, which would be querying an API anyway. So I'd vote against it.

majora2007 commented 2 weeks ago

Kavita also counts the max then tries to match it to the highest volume or issue (since it supports both). I would also welcome a volume Count being added in addition to an issue count.

Can we not have these as optional fields that, while not used by Metron, can be used by people that want to use this metadata for Manga or non-American Comics? Or does every field in the spec need to match with Metron.cloud's metadata?

bpepple commented 1 week ago

Well, I haven't seen any other replies, so it seems having an IssuesCount element would be useful for comic servers and I'm willing to add that if it helps with adoption (even though I'm not a big fan of adding non-static information).

It seems like it should be a sub-element of the Series element, but should the new element be IssuesCount or something like TotalIssues? Or something else?

majora2007 commented 1 week ago

Sub-element of Series I agree on. I'm indifferent to the name, they both work well to me.

ajslater commented 1 week ago

So when tagging from comicvine the tagger should query the issue counts for all volumes under the series and sum them to place in this series issue count field?

bpepple commented 1 week ago

> So when tagging from comicvine the tagger should query the issue counts for all volumes under the series and sum them to place in this series issue count field?

How is it done currently with Comic-Tagger? Or does it not provide an issue count? I haven't run CT in ages, so I've zero clue how they do it with the CV API.

ajslater commented 1 week ago

In comictagger the issue_count is volume based. Probably because that's how ComicVine does it. So with some example comicvine data:

Series:
  volume_count: 2
  Volume 1:
    issue_count: 12
  Volume 2:
    issue_count: 10
    Issue #004

In comictagger, the issue_count for Volume 2, Issue #004 would be 10. Comictagger also does a volume_count for ComicBookLover's ComicBookInfo format, so that would be 2. But I've been told that format is being deprecated.

It sounds like with the schema you're proposing the MetronInfo Series.issueCount would properly be 22 in this example.
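The arithmetic behind those three numbers, as a small sketch using only the example data above (the variable names are illustrative, not fields from any tagger):

```python
# Volume number -> issue_count, taken from the example ComicVine data above.
volumes = {1: 12, 2: 10}

per_volume_issue_count = volumes[2]         # volume-based count for Volume 2, Issue #004
volume_count = len(volumes)                 # number of volumes in the series
series_issue_count = sum(volumes.values())  # series-wide total across all volumes

print(per_volume_issue_count, volume_count, series_issue_count)  # 10 2 22
```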

bpepple commented 1 week ago

Ok, just to make sure I understand you using some live data:

>>> from comicsdb.models import Series
>>> series = Series.objects.filter(name__iexact="black lightning")
>>> series.count()
3
>>> for item in series:
...     print(f"{item.name} v{item.volume}: {item.issue_count} issues")
... 
Black Lightning v1: 11 issues
Black Lightning v2: 13 issues
Black Lightning v3: 1 issues

The way I see the IssueCount would be if you were to tag Black Lightning v1 #1 the Series element would look something like this:

<Series id="1897" lang="en">
    <Name>Black Lightning</Name>
    <SortName>Black Lightning</SortName>
    <Volume>1</Volume>
    <Format>Single Issue</Format>
    <StartYear>1977</StartYear>
    <IssueCount>11</IssueCount>
</Series>

Here we are given the total number of issues for this particular series, i.e. Black Lightning v1. It's possible I'm misunderstanding what you guys would need.

ajslater commented 1 week ago

To me it seems like if the element is part of the Series element it would logically relate to Series. So I might prefer something more like:

<Volume issueCount="11">1</Volume>

If this representation of Series is not meant to be a standalone representation of a Series but only occur as a subtag in an Issue's metadata, then the Series.IssueCount tag would be unambiguous because you won't ever have multiple Volume tags.

However, if that schema is meant to represent a standalone series then it might either need a Volume Schema as well to represent multiple volumes within a series, or use the issueCount Volume attribute suggested above.

Also, since you represented it above, <StartYear> is meant to represent the start of the Series, and the Volume start year is unrepresented, yeah? Sometimes volumes are 1-based and sometimes they're years, which encodes that information.

bpepple commented 6 days ago

> If this representation of Series is not meant to be a standalone representation of a Series but only occur as a subtag in an Issue's metadata, then the Series.IssueCount tag would be unambiguous because you won't ever have multiple Volume tags.

> However, if that schema is meant to represent a standalone series then it might either need a Volume Schema as well to represent multiple volumes within a series, or use the issueCount Volume attribute suggested above.

Ok, I think I understand what you're saying.

Yes, the Series element represents the series for the individual issue the xml file is providing information for, not a representation of the Series object (including its various volumes).

> Also, since you represented it above, <StartYear> is meant to represent the start of the Series, and the Volume start year is unrepresented, yeah? Sometimes volumes are 1-based and sometimes they're years, which encodes that information.

Isn't the volume number as the Start Year a hack that was used as a workaround for Comic Vine? I'm not aware of any series that uses a year in their indicia (not to say it's not possible, since we are talking about the comic industry).

ajslater commented 6 days ago

> Yes, the Series element represents the series for the individual issue the xml file is providing information

Got it. Good. Thanks.

> Isn't the volume number as the Start Year a hack that was used as a workaround for Comic Vine?

I don't know. Like you said, I wouldn't expect consistency. I just looked at a title I'm familiar with: Wolverine's first limited series by Frank Miller in 1982 is tagged as Volume 1 on ComicVine. Wolverine's first ongoing series in 1988 used to be tagged as Volume 2, but is now also Volume 1 on ComicVine. The Wolverine volume that started in 2020 is Volume 6 on ComicVine.

IIRC, Marvel has inconsistently referred to serial volume numbers during the run of this and all of its comics. But today they seem to have retroactively replaced volume numbers with year of first issue: Wolverine 1982, Wolverine 1988, Wolverine 2020, Wolverine 2024 and so on. Despite referring to "Vol" often in the past, the Marvel website's current term for each volume is "Series".

I tend to see Series and Volumes as both having a bit of extra metadata associated with them. In XML this could be attributes or subtags depending on the importance of the metadata and personal preference. Publishers much less so and Imprints are rarely tagged at all. I prefer making potentially rich schemas for all those layers, but there's not really a practical case for Publishers or Imprints to be more than simple text elements.

Anyway, this is a digression. Volumes are years sometimes and not other times. I've never seen a Volume as a non-numeric string ever, but it wouldn't surprise me if one appeared.

The layout you have above looks to me like it implies the IssueCount and StartYear elements are referring to the Series, so if they referred to Volume I'd prefer they either live as Volume attributes or subtags, or be given Volume-specific names. But if there was a consistent convention that they always referred to Volume, that could be documented and we could all work with it.

bpepple commented 5 days ago

> The layout you have above looks to me like it implies the IssueCount and StartYear elements are referring to the Series, so if they referred to Volume I'd prefer they either live as Volume attributes or subtags, or be given Volume-specific names. But if there was a consistent convention that they always referred to Volume, that could be documented and we could all work with it.

I don't have a strong opinion either way. What would you like for the Series element and sub-elements to look like?

ajslater commented 4 days ago

What springs to mind immediately is:

<Series id="1897" lang="en">
    <Name>Black Lightning</Name>
    <SortName>Black Lightning</SortName>
    <Volume>
        <Name>1</Name>
        <IssueCount>11</IssueCount>
        <StartYear>1977</StartYear>
    </Volume>
    <Format>Single Issue</Format>
    <StartYear>1977</StartYear>
</Series>

or

<Series id="1897" lang="en">
    <Name>Black Lightning</Name>
    <SortName>Black Lightning</SortName>
    <Volume issueCount="11" startYear="1977">1</Volume>
    <Format>Single Issue</Format>
</Series>

Tags vs attributes is a judgement call. But I'd be inclined to use <Volume><Name>1</Name></Volume> for consistency with Series.
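For a consumer, either layout is about equally easy to read. A rough sketch with Python's standard `xml.etree.ElementTree`, assuming trimmed copies of the two proposed layouts (neither is part of the schema yet, and `read_volume` is a hypothetical helper):

```python
import xml.etree.ElementTree as ET

# Trimmed copies of the two proposed layouts; neither is final.
NESTED = """<Series id="1897" lang="en">
    <Name>Black Lightning</Name>
    <Volume>
        <Name>1</Name>
        <IssueCount>11</IssueCount>
        <StartYear>1977</StartYear>
    </Volume>
</Series>"""

ATTRIBUTES = """<Series id="1897" lang="en">
    <Name>Black Lightning</Name>
    <Volume issueCount="11" startYear="1977">1</Volume>
</Series>"""


def read_volume(series_xml: str) -> dict:
    """Return the volume number, issue count, and start year from either layout."""
    volume = ET.fromstring(series_xml).find("Volume")
    if volume.find("Name") is not None:
        # Sub-element layout: values live in child tags.
        return {
            "number": volume.findtext("Name"),
            "issue_count": int(volume.findtext("IssueCount")),
            "start_year": int(volume.findtext("StartYear")),
        }
    # Attribute layout: the number is the element text, the rest are attributes.
    return {
        "number": volume.text.strip(),
        "issue_count": int(volume.get("issueCount")),
        "start_year": int(volume.get("startYear")),
    }


print(read_volume(NESTED) == read_volume(ATTRIBUTES))  # True
```

Both forms yield the same values, so the choice is mostly about readability and how much per-volume metadata is expected to accumulate later.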

bpepple commented 3 days ago

Truthfully, I like using the attributes more than your first example, but I'm fine with either.

One thing about your first example is that I think it would be more clear to the user if we used Number instead of Name, since I'm not aware of any volume containing alphanumeric characters (though it could be addressed by the documentation).

@majora2007: Do you have an opinion on @ajslater's series suggestions?

ajslater commented 3 days ago

> Truthfully, I like using the attributes more than your first example, but I'm fine with either.

Sounds good.

> One thing about your first example is that I think it would be more clear to the user if we used Number instead of Name, since I'm not aware of any volume containing alphanumeric characters (though it could be addressed by the documentation).

Yeah, I hear that. I use Name because I have some code that treats series, volume, imprint, and publisher abstractly. But in my comicbox tool I also coerce the Volume name into an int. So Number is fine. I keep waiting for that int coercion decision to bite me, but it hasn't yet. It's probably fine.
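For illustration only, that kind of coercion could look like the sketch below. This is not comicbox's actual code; `coerce_volume` is a hypothetical helper, and returning `None` for non-numeric input is an assumption about how to handle the case that has not shown up yet:

```python
from typing import Optional


def coerce_volume(raw: Optional[str]) -> Optional[int]:
    """Turn a volume string into an int, or None when it is missing or non-numeric."""
    try:
        return int(raw.strip())
    except (AttributeError, ValueError):
        # AttributeError: raw was None; ValueError: raw was not an integer string.
        return None


print(coerce_volume("1"), coerce_volume(" 2020 "), coerce_volume("Annual"), coerce_volume(None))
# 1 2020 None None
```

Serial numbers and start years both survive the conversion, so a reader still has to decide for itself whether a value like 2020 is a year or a sequence number.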

bpepple commented 2 days ago

Alright. I'll write up a PR for this change (Option 1, unless someone expresses an interest in Option 2 before this afternoon when I create it).

majora2007 commented 2 days ago

There has been a lot of discussion. I'm about to head on a small holiday, so next week I'll give full comments.

bpepple commented 2 days ago

> There has been a lot of discussion. I'm about to head on a small holiday, so next week I'll give full comments.

Ok, I'll hold off on creating a PR until you get a chance to comment. Have a nice weekend!