Closed Zachu closed 2 years ago
It certainly could be added. I am simply curious as to how it would be used? In other words, the purpose of this repository is displaying card images. Are you using the repo for other purposes? If I would implement it, I would be tempted to do something like:
{ "name": "frigid apparition", "points": 247, "expansion": "Gloomhaven", "level": "X", "card number": 158, "image": "character-ability-cards/gloomhaven/MT/gh-frigid-apparition.png", "xws": "frigidapparition" },
The reason for this format would be to standardize it as something to be used (specifically card number) across all types of cards.
It could definitely be something like that too. I thought the "data"-section would be a unified place to store data that's actually printed on the card. Basically the name would then fit into that section too. But this was just something I initially thought and not something I put lots of thought effort into.
My actual use case was to just simply "show me Mind Thief's level1 (and X) cards" so that I would not get spoiled on the rest of the abilities I would unlock later on.
But considering the README's first sentence:
> An easy-to-use collection of data and images...
it doesn't actually state what this can and should be used for. And that's great! I think this project has an opportunity to lay a foundation for lots of projects apart from just showing you the images.
I immediately also thought that under data/character-mats.js there could be, for example, a number stating the hand size and max HP for each level of the character, because that is stated in that entity. Therefore one could use this as an authoritative machine-readable data source for *Haven facts, as well as a place where the graphics can be found. But I guess this is quite an ambition =)
Btw I would gladly do the work myself and send a pull request :) I can try something tomorrow now that I know you're open to the idea
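The "show me Mind Thief's level 1 (and X) cards" use case could look roughly like the sketch below. It assumes a "level" key like the one proposed in this thread, and it assumes the class code (here "MT") is the second-to-last segment of the image path. The third sample entry and the function name are made up for illustration.

```python
# Sample entries. The first two reuse values quoted in this thread; the third
# is a hypothetical placeholder to show that higher levels are filtered out.
cards = [
    {"name": "frigid apparition", "level": "X",
     "image": "character-ability-cards/gloomhaven/MT/gh-frigid-apparition.png"},
    {"name": "balanced measure", "level": "X",
     "image": "character-ability-cards/gloomhaven/BR/gh-balanced-measure.png"},
    {"name": "hypothetical level 5 card", "level": "5",
     "image": "character-ability-cards/gloomhaven/MT/gh-placeholder.png"},
]


def starting_cards(cards, class_code, levels=("1", "X")):
    """Return the cards of one class whose level is in `levels`.

    Assumption: the class code is the second-to-last segment of the image path.
    """
    selected = []
    for card in cards:
        parts = card["image"].split("/")
        if len(parts) >= 2 and parts[-2] == class_code and card.get("level") in levels:
            selected.append(card)
    return selected


mind_thief = starting_cards(cards, "MT")
print([card["name"] for card in mind_thief])
```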
I would also like to have more data (in particular the level of character cards) in the JSON file. For ability cards the initiative would also be a sensible datum to add here.
Here is an outline of a project I would do, if this data was available:
Just last weekend I wanted to create "character overviews" with all the ability cards of each character as a PDF (to annotate it with my thoughts on the cards using a tablet). My idea was to produce the PDF with LaTeX and write a small Python program to generate the .tex code based on the JSON data. For such an overview, sorting by level is the only sensible ordering.
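A rough sketch of what such a generator might look like, assuming each entry carries "level" and "initiative" as strings (the keys discussed in this thread). The function name and the placement of X between levels 1 and 2 are my own choices, one sample name is guessed from its filename, and LaTeX special characters in card names are not escaped here.

```python
# Assumed ordering: "X" cards are listed right after level 1.
LEVEL_ORDER = ["1", "X", "2", "3", "4", "5", "6", "7", "8", "9"]


def cards_to_tex(cards):
    """Render cards as a LaTeX itemize list, sorted by level."""
    ordered = sorted(cards, key=lambda card: LEVEL_ORDER.index(card["level"]))
    lines = [r"\begin{itemize}"]
    for card in ordered:
        lines.append(
            rf"  \item Level {card['level']}: {card['name'].title()} "
            rf"(initiative {card['initiative']})"
        )
    lines.append(r"\end{itemize}")
    return "\n".join(lines)


# Values taken from examples in this thread; the name of the second card is
# guessed from its image filename.
cards = [
    {"name": "tinkerers tools", "level": "3", "initiative": "26"},
    {"name": "balanced measure", "level": "X", "initiative": "77"},
]
tex = cards_to_tex(cards)
print(tex)
```

The output can be pasted into a document body or written to a file and pulled in with \input.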
> Btw I would gladly do the work myself and send a pull request :) I can try something tomorrow now that I know you're open to the idea
I could also help adding the data. BTW: the index.html has listings by level -- one could try to extract them. Another source of machine-readable card data could also be Gloomhaven Digital (I haven't checked so far).
> For ability cards the initiative would also be a sensible datum to add here.
I completely agree and actually when doing some sort of a proof of concept for the data extraction from the images I already did try to scan the initiative value also.
> I could also help adding the data. BTW: the index.html has listings by level -- one could try to extract them. Another source of machine-readable card data could also be Gloomhaven Digital (I haven't checked so far).
Nice! I had not noticed that. I guess we should still try the tesseract way if we want other values to be spotted from the images, unless someone can find a better source for the card data. One other source for, say, the initiative value could be the TTS Gloomhaven Fantasy Setup mod, which has "sort hand by initiative". But I don't know if that's open source either.
I'm currently trying to teach a tesseract model the fonts used in Gloomhaven because it wasn't that accurate with the default model. Levels could be extracted with good accuracy but the initiative value often failed. It took me a few whiles to learn how this tesseract thing works but let's see if I get something useful out of it =)
I want to make something clear ... I am not opposed to making modifications to the JSON to provide more information about any of the cards that it currently supports, including the Character Ability Cards UNLESS such modifications would prevent the WAV from functioning properly based on its original purpose.
To that end, I will have to vet all pull requests, and do thorough testing BEFORE anything is rolled into production.
I currently have a pretty much working setup, with only a few values that have to be put in manually because the OCR is unable to read them. I'll see a bit later when I can do a PR for this, and then refine the manually entered values there as well.
My output currently looks something like this, which of course has to be combined into the data sources then:
{"image":"character-ability-cards/gloomhaven/TI/gh-tinkerers-tools.png","initiative":26,"level":"3","number":47}
If something fails to read correctly it says "null". The level field is currently the string "null", but anyway =) It should be quite easy to go through.
{"image":"character-ability-cards/gloomhaven/EL/gh-ice-spikes.png","initiative":null,"level":"1","number":455}
What I noticed during this is that the "BS" class has different padding in its cards, so I have to do something about that as well =) And currently I've only tried the Gloomhaven ability cards.
Still not pull-request worthy, but here's an update at least. All the Gloomhaven cards have now been parsed in my fork of this repository, Zachu/worldhaven, but I've not yet filled in the ones where the OCR reader couldn't find the correct values. That's something I could use some help with, but maybe a bit later, once I'm otherwise done fiddling with it by scripts. Anyway, examples can be seen here: https://github.com/Zachu/worldhaven/blob/master/data/character-ability-cards.js#L4559-L4568.
I plan to read through the Crimson Scales, Forgotten Circles, and Jaws of the Lion cards as well before considering the automation part done. I intend to leave the Frosthaven ones out because they seem to be... a bit different. I just need to fix some positioning and stuff for the other games.
A few questions though, perhaps mostly for @any2cards, but others can also comment on them.
[...] image key. Meaning that the first item in the data file would be aa-back instead of be-back. Do you see this as an issue?
[...] initiative, level, and number be, and should we put them under some key telling us that the values are read from the actual image?
Here are my personal opinions on these, but I consider @any2cards as having the last call on them.
[...] data key, but actually, now thinking of it more, I'd choose something like values, where I'd put the initiative, level, and number. We could follow a similar pattern of values in other data files as well if we end up reading data out of those too.
To be clear, there are reasons for the very specific order that data is within each of the JSON files (in terms of what shows first, second, etc.). I do not want to change that order. This is true for the order of individual entries, as well as for the subentries for each name entry.
Ideally, each new key would be entered just above the "xws" key which is the last key for each entry. All things being equal, each additional item you want information for would be its own key. So for example, the Brute's "Balanced Measure" could go from:
{ "name": "balanced measure", "points": 31, "expansion": "Gloomhaven", "image": "character-ability-cards/gloomhaven/BR/gh-balanced-measure.png", "xws": "balancedmeasure" },
To:
{ "name": "balanced measure", "points": 31, "expansion": "Gloomhaven", "image": "character-ability-cards/gloomhaven/BR/gh-balanced-measure.png", "initiative": "77", "level": "X", "cardno": "012", "xws": "balancedmeasure" },
I know it may seem like I am being a bit of a pain in the (*& when it comes to specifying how it would look, but there are very good reasons, as there are a whole bunch of programs and tools I would have to change if we alter the format too much. So, from a very selfish standpoint, this is what would work best for me.
> I know it may seem like I am being a bit of a pain in the (*& when it comes to specifying how it would look, but there are very good reasons, as there are a whole bunch of programs and tools I would have to change if we alter the format too much. So, from a very selfish standpoint, this is what would work best for me.
Nah, it's your project and you have all the power in the world to define the boundaries for others to work within. And if they can't then they can fork off :)
> To be clear, there are reasons for the very specific order that data is within each of the JSON files (in terms of what shows first, second, etc.). I do not want to change that order.
Bummer... Well this should be achievable with little work.
> Ideally, each new key would be entered just above the "xws" key which is the last key for each entry.
This is a bit of PITA since as the JSON specification says, "An object is an unordered set of name/value pairs" which means that the keys are not in any order/order shouldn't matter. I have to see how many hoops it requires me to jump through though :)
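For what it's worth, the hoops may be small depending on the tooling. As a sketch only (not the jq-based approach used in this thread): Python dicts keep insertion order since 3.7, and the json module reads and writes keys in that order, so new keys can be slotted in just above "xws". The helper name insert_before is made up for this example.

```python
import json


def insert_before(entry, anchor_key, new_items):
    """Return a copy of `entry` with `new_items` placed just before `anchor_key`.

    If `anchor_key` is missing, the new items are appended at the end.
    """
    result = {}
    for key, value in entry.items():
        if key == anchor_key:
            result.update(new_items)
        result[key] = value
    if anchor_key not in entry:
        result.update(new_items)
    return result


# The "balanced measure" entry as quoted in this thread.
entry = json.loads(
    '{"name": "balanced measure", "points": 31, "expansion": "Gloomhaven", '
    '"image": "character-ability-cards/gloomhaven/BR/gh-balanced-measure.png", '
    '"xws": "balancedmeasure"}'
)
updated = insert_before(
    entry, "xws", {"initiative": "77", "level": "X", "cardno": "012"}
)
print(json.dumps(updated, indent=2))
```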
> All things being equal, each additional item you want information for would be its own key.
I'm OK with this one. Flat structure is completely fine.
Finally, we always have the option of keeping the card data in a separate project as well, so that it would not interfere with wherever Worldhaven is being used.
So I spent some time thinking about this, and looking at my various tools that make use of the repository. I went ahead and made some changes, to make things easier for this implementation. You can place your stuff anywhere, so if I am assuming correctly, it would be easiest to place it at the end of an entry, after "xws", that is fine. Just remember you will have to add a comma after "xws". In addition, I have made it so entries don't have to be flat. So if you would prefer something like:
"name": "balanced measure", "points": 31, "expansion": "Gloomhaven", "image": "character-ability-cards/gloomhaven/BR/gh-balanced-measure.png", "initiative": "77", "level": "X", "cardno": "012", "xws": "balancedmeasure", "values": { "initiative": "77", "level": "X", "cardno": "012" }
this would be fine. Note that I think all of these, including initiative, cardno, etc., should be in quotes (as character strings) rather than numbers, as I am aware of situations for both where the value cannot be a pure number.
In addition, I personally don't think "values" or "data" are a good key name; but for the life of me, I haven't been able to think of a better one at the moment.
I don't mind the values being on the flat level with everything else and I also can't think of a better key there so let's go with flat structure then :)
> I went ahead and made some changes, to make things easier for this implementation. You can place your stuff anywhere, so if I am assuming correctly, it would be easiest to place it at the end of an entry, after "xws", that is fine.
Nice, thank you! Yeah, the keys currently go after xws, basically because I'm merging the output the OCR gets with jq into the existing ones. And therefore the new keys just get appended to the old objects.
> Just remember you will have to add a comma after "xws"
Thanks :) I'm mainly using jq and I let it handle the correct rendering of the JSON. But better safe than sorry!
> Note that I think all of these, including initiative, cardno, etc., should be in quotes (as character strings) rather than numbers, as I am aware of situations for both where the value cannot be a pure number.
Having played only one scenario of Gloomhaven so far, I can't argue back at all 😅 With initiative I think that sounds plausible. For card numbers that sounds weird, but I can cast them into strings. I don't have any issue with that. But I guess Crimson Scales or other add-ons could prefix their cards like that.
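The merge-and-cast step can be sketched in Python as well (a stand-in for the jq pipeline mentioned above, not a copy of it). It assumes OCR records are matched to existing entries by their "image" path, appends the OCR keys after the existing ones, and casts every non-null value to a string. The function name and the "xws" values in the sample entries are placeholders.

```python
def merge_ocr(entries, ocr_records):
    """Append OCR-derived keys to matching entries, casting values to strings.

    Records are matched on the "image" path; unread (None) values stay None
    so they can be filled in by hand later.
    """
    by_image = {record["image"]: record for record in ocr_records}
    merged = []
    for entry in entries:
        record = by_image.get(entry["image"], {})
        extra = {
            key: (None if value is None else str(value))
            for key, value in record.items()
            if key != "image"
        }
        merged.append({**entry, **extra})
    return merged


# OCR output as shown earlier in this thread.
ocr_records = [
    {"image": "character-ability-cards/gloomhaven/TI/gh-tinkerers-tools.png",
     "initiative": 26, "level": "3", "number": 47},
    {"image": "character-ability-cards/gloomhaven/EL/gh-ice-spikes.png",
     "initiative": None, "level": "1", "number": 455},
]
# Existing entries, trimmed down; the "xws" values are placeholders.
entries = [
    {"image": "character-ability-cards/gloomhaven/TI/gh-tinkerers-tools.png",
     "xws": "placeholder-a"},
    {"image": "character-ability-cards/gloomhaven/EL/gh-ice-spikes.png",
     "xws": "placeholder-b"},
]
merged = merge_ocr(entries, ocr_records)
print(merged[0])
```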
First of all, thank you for the work you put into this already @Zachu!
> > Note that I think all of these, including initiative, cardno, etc., should be in quotes (as character strings) rather than numbers, as I am aware of situations for both where the value cannot be a pure number.
> Having played only one scenario of Gloomhaven so far, I can't argue back at all 😅 With initiative I think that sounds plausible. For card numbers that sounds weird, but I can cast them into strings. I don't have any issue with that. But I guess Crimson Scales or other add-ons could prefix their cards like that.
A few remarks on "numbers vs strings":
For initiative, actual numbers generally make sense, as doing computations with them is sensible (for example computing the mean value). However, there is at least one character (in Frosthaven) whose initiative consists of something that is not a number. In that particular case one could turn this into a number without loss of data, but we don't know what else Frosthaven might throw at us, so going for a string instead of an integer might be the safer choice.
For the level, strings make more sense, mainly because of the X-cards (but there might be other special cases as well). From a user perspective it is also not much additional hassle to set up sorting by level when they are strings instead of numbers (where one could define, say, X = 1.5).
Strings for the card numbers are the right choice imho, because arithmetic on card numbers is not sensible and prefixes might occur in the future.
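To illustrate how little hassle strings cause on the consumer side, here is a small sketch. It uses the X = 1.5 convention suggested above and skips any initiative that is not purely numeric (such as a "-" placeholder) before computing a mean. The helper names are made up for this example.

```python
from statistics import mean


def level_key(level):
    """Sort key for string levels: "X" sorts between levels 1 and 2."""
    return 1.5 if level == "X" else float(level)


def numeric_initiatives(values):
    """Convert purely numeric initiative strings to ints and skip the rest."""
    return [int(value) for value in values if value.isdigit()]


levels = ["3", "X", "1", "2", "X", "9"]
sorted_levels = sorted(levels, key=level_key)
print(sorted_levels)

initiatives = ["77", "26", "-"]
mean_initiative = mean(numeric_initiatives(initiatives))
print(mean_initiative)
```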
Considering the amount of data this repo contains, one could consider defining a JSON schema to validate the files. This can become quite handy in finding faulty entries – for example in the cases, where the OCR is not right. I will fiddle around with this a bit in the coming days (there is some code I can recycle from a project I used JSON schema in) 😉
I can actually agree with all of your points about the strings. I'll go the string route with all of them. Thank you!
> Considering the amount of data this repo contains, one could consider defining a JSON schema to validate the files. This can become quite handy in finding faulty entries – for example in the cases, where the OCR is not right. I will fiddle around with this a bit in the coming days (there is some code I can recycle from a project I used JSON schema in) 😉
Yeah, defining a schema does sound like a good idea. In my scripts I'm not blindly taking arbitrary output from the OCR, so that shouldn't be an issue, but I might have bugs and whatnot in the scripts that the schema would reveal. And who knows what other ways there will be in the future of appending stuff in there.
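Until an actual JSON schema exists, the kind of check it would perform can be approximated in plain Python. This is only a stand-in for JSON Schema, not an implementation of it. The patterns below are assumptions for Gloomhaven-style cards (one- or two-digit initiative, levels 1 to 9 or X, three-digit card numbers) and would need loosening for card backs, prefixes, or Frosthaven.

```python
import re

# Assumed value patterns; adjust once the real data conventions are settled.
RULES = {
    "initiative": re.compile(r"\d{1,2}"),
    "level": re.compile(r"[1-9]|X"),
    "cardno": re.compile(r"\d{3}"),
}


def validate(entry):
    """Return a list of human-readable problems found in one entry."""
    errors = []
    for key, pattern in RULES.items():
        value = entry.get(key)
        if not isinstance(value, str):
            errors.append(f"{key}: expected a string, got {value!r}")
        elif not pattern.fullmatch(value):
            errors.append(f"{key}: unexpected value {value!r}")
    return errors


good = {"name": "balanced measure", "initiative": "77", "level": "X", "cardno": "012"}
bad = {"name": "hypothetical card", "initiative": 26, "level": "10", "cardno": "455"}

print(validate(good))
print(validate(bad))
```

A numeric initiative or an out-of-range level shows up as one line per problem, which is roughly what a schema validator would report.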
Starting to get the accuracy of the tesseract model and the overall tooling pretty much there. Still have some problems in that the model can't tell 6 and 0 apart 😅 I'll try to fix that now.
I have updated character-ability-cards.js to include the following Meta information: Level, Initiative, and Card #.
Out of curiosity, what did you use as source for that data? Did you build on top of my OCRing PR or did you find another source of truth?
Lol. To be honest, the answer is funny. I had another request from a good friend to add this same information. It came two days ago. He has helped me an enormous amount in the past, so I was willing to do whatever to help. Since I did not know how long your efforts would take, I simply wrote some code to add all the lines with a value of "-" (which for example the card backs still retain), and then I manually entered all of the data.
Now, your efforts are not a waste. Perhaps you can double check my work by generating your own file, and we can diff the two and see if my manual efforts are accurate.
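Such a cross-check might be sketched like this, assuming both data sets are lists of entries that share the "image" path as an identifier and use the same key names. The sample values are invented to show a 6-versus-0 misread and do not come from the real files.

```python
def diff_cards(manual, ocr, keys=("initiative", "level", "cardno")):
    """List (image, key, manual_value, ocr_value) for every disagreement."""
    ocr_by_image = {card["image"]: card for card in ocr}
    mismatches = []
    for card in manual:
        other = ocr_by_image.get(card["image"])
        if other is None:
            continue  # card missing from the OCR set; nothing to compare
        for key in keys:
            if card.get(key) != other.get(key):
                mismatches.append((card["image"], key, card.get(key), other.get(key)))
    return mismatches


# Invented sample data with placeholder image names.
manual = [
    {"image": "a.png", "initiative": "16", "level": "1", "cardno": "001"},
    {"image": "b.png", "initiative": "40", "level": "X", "cardno": "002"},
]
ocr = [
    {"image": "a.png", "initiative": "10", "level": "1", "cardno": "001"},
    {"image": "b.png", "initiative": "40", "level": "X", "cardno": "002"},
]
mismatches = diff_cards(manual, ocr)
print(mismatches)
```

Each reported tuple is a card to look at by eye, whichever side turns out to be wrong.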
Since this has been added, I am closing this issue.
Would ability card levels and perhaps even the card number fit into the scope of this project?
What I was thinking is that the https://github.com/any2cards/worldhaven/blob/master/data/character-ability-cards.js file would contain something like
I think these could be extracted with tesseract or something similar, if they are not already available somewhere else in a machine-readable format.