PokemonTCG / pokemon-tcg-data

The data found within the Pokémon TCG API

fix: bp-4 artist name #452

Open phiilu opened 1 year ago

phiilu commented 1 year ago

Interestingly enough, on some cards the name is K. Hoshiba and on others it is K Hoshiba. I'm not sure which is more "correct". I would keep it consistent across cards so that the same artist doesn't show up under two different names.

Cealgair commented 1 year ago

In my opinion, "artist" should match what is written on the card. Since bp-4 and others have K Hoshiba, I think the most "correct" approach would be to have "artist": "K Hoshiba" on those and "artist": "K. Hoshiba" on the others. I don't think this should be the final say without discussion, but it's consistent with what I've done in #362, which was merged.

phiilu commented 1 year ago

Yes, I understand that the API should follow the physical cards as closely as possible, which also means it includes the human errors introduced in real life. It just makes the data harder to work with.

If I think with a database in mind, there would be an artists table with id and unique name columns. Without manual checks or extra logic, it is hard to keep the data consistent and avoid saving two artists that are meant to be one. Querying for K Hoshiba right now would return only 1 card, whereas querying for K. Hoshiba returns the rest.

Without any additional information (an id) indicating that those are the same artist, linking them together is quite a hard task.

I'm not sure how that hidden information (like the artist you removed, or artists with multiple names) can or should be added to the API. Strictly mapping the physical cards in the API might be okay, but the API could also "correct" those errors while still including the original mistakes.
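
To make the database point concrete, here is a rough sketch of what such a layout could look like. Everything in it is an assumption for illustration only: the table and column names (`artists`, `cards`, `printed_artist`, `artist_id`), the choice of "K. Hoshiba" as the canonical spelling, and the card ids `example-1` and `example-2` are hypothetical and not part of the actual data. Only bp-4 and the two spellings come from this thread.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical artists/cards layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE artists (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE      -- one canonical spelling per artist
        );
        CREATE TABLE cards (
            id             TEXT PRIMARY KEY,
            printed_artist TEXT NOT NULL,  -- exactly what the card shows
            artist_id      INTEGER NOT NULL REFERENCES artists(id)
        );
        """
    )
    # Assumption: "K. Hoshiba" is picked as the canonical name.
    conn.execute("INSERT INTO artists (id, name) VALUES (1, 'K. Hoshiba')")
    conn.executemany(
        "INSERT INTO cards (id, printed_artist, artist_id) VALUES (?, ?, ?)",
        [
            ("bp-4", "K Hoshiba", 1),        # spelling without the period
            ("example-1", "K. Hoshiba", 1),  # hypothetical card id
            ("example-2", "K. Hoshiba", 1),  # hypothetical card id
        ],
    )
    return conn


def cards_by_printed_name(conn: sqlite3.Connection, name: str) -> list[str]:
    """Exact match on the printed string: the two spellings split the results."""
    rows = conn.execute(
        "SELECT id FROM cards WHERE printed_artist = ? ORDER BY id", (name,)
    )
    return [row[0] for row in rows]


def cards_by_artist_id(conn: sqlite3.Connection, artist_id: int) -> list[str]:
    """Match on the artist id: every card by that artist, however it is spelled."""
    rows = conn.execute(
        "SELECT id FROM cards WHERE artist_id = ? ORDER BY id", (artist_id,)
    )
    return [row[0] for row in rows]
```

With this layout the printed text stays untouched on each card, while the id carries the "these are the same person" information that the name alone cannot.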

adback03 commented 1 year ago

Generally speaking, I have tried to match whatever the card shows. I think one way to fix this would be something like what you said: I could introduce artist as a completely new entity with its own id. I could also simply handle this as part of the API... that is, know about these outliers and allow searches on either version to return the same results.
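
The second option (handling the outliers at search time) could be sketched roughly as below. This is not how the API is implemented; `ARTIST_ALIASES`, `canonical_artist`, and `search_by_artist` are hypothetical names, and the card list is made-up sample data apart from bp-4 and the two spellings discussed above.

```python
# Hypothetical alias table: known outlier spelling -> spelling used for matching.
ARTIST_ALIASES = {
    "K Hoshiba": "K. Hoshiba",
}


def canonical_artist(name: str) -> str:
    """Map a known outlier spelling to its canonical form; leave other names as-is."""
    return ARTIST_ALIASES.get(name, name)


def search_by_artist(cards: list[dict], query: str) -> list[dict]:
    """Return cards whose artist matches the query after alias resolution.

    The stored "artist" value is never rewritten, so each card still
    reports exactly what is printed on it.
    """
    wanted = canonical_artist(query)
    return [card for card in cards if canonical_artist(card["artist"]) == wanted]


# Made-up sample data; only bp-4 and the two spellings come from the thread.
CARDS = [
    {"id": "bp-4", "artist": "K Hoshiba"},
    {"id": "example-1", "artist": "K. Hoshiba"},
    {"id": "example-2", "artist": "Some Other Artist"},
]
```

Searching for either "K Hoshiba" or "K. Hoshiba" then returns the same two cards, while the data itself keeps the spelling shown on each card. The trade-off is that the alias table has to be maintained by hand as new outliers turn up.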