Open Christopheraburns opened 5 years ago
@Christopheraburns I think the method of disambiguation (determining whether you have two 9s or only one) depends on how the cards are presented to the camera. Is the camera present during the whole deal, or do you get only one picture of the scene?

In the first case, if the cards are dealt one by one and a new card never completely covers the cards already dealt, as is the case in blackjack I think (correct me if I'm wrong), it is easy to determine the new card by comparing the values and occurrences of the cards detected in every frame. Let's say at time t0 we have 0 occurrences of the 9 of hearts. At time t1, the dealer puts a 9 of hearts on the scene, and we detect 2 occurrences (one per visible corner). At time t2, another 9 of hearts is dealt. Now we detect 3 or 4 occurrences, depending on whether the previous 9 of hearts is partially covered or not. This way, it is easy to keep the count of 9s of hearts up to date.

In the case where you have just one picture of the scene, maybe you can rely on some knowledge of how the cards are placed. For instance, if the cards are always roughly parallel to, let's say, the y-axis of the picture, then you only need to learn to detect the upper-left corners of the cards (or the lower-right corners, but not both). This way, for each card dealt, you detect only one occurrence. In practice, it means you have to modify the scripts that generate the dataset accordingly.

Lastly, if there is no such restriction, well, there are still some geometric constraints you can rely on to determine whether 2 corners are from the same card or not. If they are from the same card, you know that they point in opposite directions and are at a fixed distance from each other. You can determine the orientation of one corner by applying some "classical" computer vision techniques to the detected zone given by YOLO (contour detection, rotated bounding box, ...).
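To make the first and last ideas a bit more concrete, here is a minimal Python sketch. It is only an illustration under assumptions, not code from the repo: the function names (`update_dealt`, `count_cards_by_pairing`), the `(label, x, y)` corner format, the expected corner-to-corner distance `diag` in pixels and the tolerance `tol` are all hypothetical. The pairing function uses only the fixed-distance constraint and ignores corner orientation, and the frame comparison assumes one card is dealt at a time and that detections do not flicker between frames.

```python
import math
from collections import Counter


def update_dealt(dealt, prev_corners, curr_corners):
    """Video case: compare per-label corner counts between two frames.

    dealt        -- Counter mapping card label -> number of cards dealt so far
    prev_corners -- dict label -> corners detected in the previous frame
    curr_corners -- dict label -> corners detected in the current frame
    Assumption: one card is dealt at a time, so any increase in the
    corner count of a label means exactly one new card of that label.
    """
    for label, n in curr_corners.items():
        if n > prev_corners.get(label, 0):
            dealt[label] += 1
    return dealt


def count_cards_by_pairing(corners, diag, tol=0.15):
    """Single-picture case: pair corners that can belong to the same card.

    corners -- list of (label, x, y) corner centers (hypothetical format)
    diag    -- expected pixel distance between the two corners of one card
    tol     -- relative tolerance on that distance (assumed value)
    Two corners with the same label whose distance is close to `diag`
    are counted as one card; every unpaired corner counts as its own card.
    """
    by_label = {}
    for label, x, y in corners:
        by_label.setdefault(label, []).append((x, y))

    counts = Counter()
    for label, points in by_label.items():
        unused = list(points)
        while unused:
            p = unused.pop()
            for q in unused:
                if abs(math.dist(p, q) - diag) <= tol * diag:
                    unused.remove(q)  # q is the opposite corner of p's card
                    break
            counts[label] += 1
    return counts


# Video case: the t0 / t1 / t2 scenario with the 9 of hearts.
dealt = Counter()
frames = [{}, {"9h": 2}, {"9h": 3}]
for prev, curr in zip(frames, frames[1:]):
    update_dealt(dealt, prev, curr)

# Single-picture case: two 9h corners at the expected distance, then too close.
one_card = count_cards_by_pairing([("9h", 0, 0), ("9h", 300, 400)], diag=500)
two_cards = count_cards_by_pairing([("9h", 0, 0), ("9h", 100, 50)], diag=500)
```

In a real application `diag` would have to be measured for your camera setup, and adding the orientation check described above should make the pairing more robust when several cards of the same value are on the table.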
Hope it helps :-)
@geaxgx - first off, thanks for sharing this great work on building a dataset. It's really the key to a good model.
I have a trained model that detects suit and rank with excellent accuracy. Here is a sample image: https://s3.amazonaws.com/chris-misc/predictions.jpg
After seeing this image I realized that when I create an application to determine the cards in a blackjack hand, the application may not know whether I have two 9s, each with one corner obscured, or a single 9 with both corners visible.
Is it possible to use some type of instance segmentation with YOLO, or is this a case where I should train multiple models, one to extract the entire card and a second to extract the suit and rank?
TIA