joaoftrodrigues / atpdb-cleanup-and-conversion-to-relational

0 stars 0 forks source link

GameRank correct values #15

Closed inescelorico closed 1 year ago

inescelorico commented 1 year ago

GAMERANK - what are the string types???

db.atpplayers.aggregate([{"$project":{ _id:0, "fieldType":{ "$type":"$GameRank"}}},{"$group":{ _id:{"fieldType":"$fieldType"},count:{$sum: 1}}}])

{ _id: { fieldType: 'int' }, count: 1189780 } { _id: { fieldType: 'string' }, count: 119055 }

1. Find strings GameRank

db.atpplayers.find({"GameRank":{"$type":2}},{"_id":0,"GameRank":1})

db.atpplayers.find({"$and":[{"GameRank":{$ne:"-"}},{"GameRank":{"$type":2}}]},{"_id":0,"GameRank":1})

db.atpplayers.find({db.atpplayers.find({"$and":[{"GameRank":{$ne:"-"}},{"GameRank":{"$type":2}},{"GameRank":{$ne:"-"}}]},{"_id":0,"GameRank":1})

2. Count string "" e "-" individually

db.atpplayers.find({"GameRank":""}).count() 99819

db.atpplayers.find({"GameRank":"-"}).count() 19236

119055 (string) = 99819 ("") + 19236 ("-")

inescelorico commented 1 year ago

Solution --> turn into null values

db.atpplayers.updateMany({"GameRank":"-"},{$set:{"GameRank":null}})
db.atpplayers.updateMany({"GameRank":""},{$set:{"GameRank":null}})