lichess-org / api

Lichess API documentation and examples
https://lichess.org/api
GNU Affero General Public License v3.0

`/api/games/user` – nondeterministic, returning duplicates, missing results #366

Open matthew-piziak opened 1 month ago

matthew-piziak commented 1 month ago

Endpoint under test:

curl "https://lichess.org/api/games/user/Deltaway?rated=true&since=1709269200000&perfType=blitz" -H "Authorization: Bearer <TOKEN_REDACTED>" -H "Accept: application/x-ndjson"

In restclient format, for clarity:

GET https://lichess.org/api/games/user/Deltaway?rated=true&since=1709269200000&perfType=blitz
Authorization: Bearer <TOKEN_REDACTED>
Accept: application/x-ndjson

I expect slightly over 500 games to be returned, but I get back 900–1500 objects, and the number varies with each run. Many of these are duplicates: for example, game I0iDZ1UZ appeared five times in one set of results. When I filter out the duplicates on my end, I am still missing some games that I ought to see, based on the results of game search in the web UI.

It's possible I'm misunderstanding the correct usage of the API; sorry if so!
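
For anyone who wants to check their own results for the same symptom, here is a minimal sketch of counting duplicates by game id. It assumes each NDJSON line has already been parsed into a map with an :id key; the namespace and function names are hypothetical, and the second id in the usage example is made up.

```clojure
(ns duplicates-sketch)

;; Assumption: every game is a map parsed from one NDJSON line and
;; carries an :id key, e.g. {:id "I0iDZ1UZ" :speed "blitz"}.

(defn duplicate-counts
  "Returns a map of game id -> occurrence count, only for ids seen more than once."
  [games]
  (into {}
        (filter (fn [[_id n]] (> n 1)))
        (frequencies (map :id games))))

(defn distinct-by-id
  "Keeps the first occurrence of each game id, preserving the order of the stream."
  [games]
  (:kept
   (reduce (fn [{:keys [seen] :as acc} game]
             (let [id (:id game)]
               (if (contains? seen id)
                 acc
                 (-> acc
                     (update :seen conj id)
                     (update :kept conj game)))))
           {:seen #{} :kept []}
           games)))
```

For example, `(duplicate-counts [{:id "I0iDZ1UZ"} {:id "aaaaaaaa"} {:id "I0iDZ1UZ"}])` reports that I0iDZ1UZ occurs twice, and `distinct-by-id` on the same input keeps two games.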

matthew-piziak commented 1 month ago

By removing the perfType query argument and filtering the stream locally, the correct results are produced (currently 519 games).

GameApiV2.scala was recently updated at this line, which turns on Elasticsearch when a perfKey argument is included in the query. The change was deployed at approximately the time I noticed the breakage, so there might be a connection.
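
The comparison between the server-filtered response and the locally filtered stream can be sketched as a set difference on game ids. This is only an illustration under the assumption that both responses have been parsed into sequences of maps with an :id key; the namespace, function names, and result keys are hypothetical.

```clojure
(ns missing-games-sketch
  (:require [clojure.set :as set]))

(defn id-set
  "Collects the :id of every game into a set, which also collapses duplicates."
  [games]
  (into #{} (map :id) games))

(defn compare-results
  "Compares games returned with the server-side filter against games
  filtered locally. :missing holds ids only the local filter produced;
  :unexpected holds ids only the server-side filter produced."
  [server-filtered locally-filtered]
  (let [server (id-set server-filtered)
        local  (id-set locally-filtered)]
    {:missing    (set/difference local server)
     :unexpected (set/difference server local)}))
```

An empty :missing set together with an empty :unexpected set would mean the two queries agree once duplicates are collapsed.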

khgiddon commented 1 month ago

+1: My app (Chess Stamps), which uses the /games API, broke at approximately the same time as the deploy that @matthew-piziak pointed out. All of my API calls return no results, but they work correctly when I exclude the perfKey argument.

matthew-piziak commented 1 month ago

Interesting, @khgiddon. In my case I do get results, but they're the wrong ones. Still, perfKey is the common factor, which makes me think the secondary database is the cause.

Thankfully I don't have many games and my use case is very simple, so my workaround is simple too: I pull all my games regardless of perfKey and filter them myself.

(ns chess
    (:require
     [clojure.pprint :as pp]
     [clojure.string :as s]
     [clojure.java.io :as io]
     [clj-http.client :as client]
     [cheshire.core :as json]
     [clj-time.core :as t]
     [clj-time.coerce :as tc]))

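(defn game->rating
  "Returns Deltaway's rating in the given game, whichever color they played."
  [game]
  (let [{{:keys [white black]} :players} game]
    (:rating (case (-> white :user :name) "Deltaway" white black))))

(defn game->date
  "Formats the game's createdAt timestamp (epoch milliseconds) as a yyyy-MM-dd string."
  [game]
  (->> game :createdAt tc/from-long ((juxt t/year t/month t/day)) (apply t/local-date) str))

(defn game-desired?
  "Local replacement for the perfType=blitz query argument."
  [game]
  (= "blitz" (:speed game)))

(def process-games
  "Transducer: NDJSON line -> parsed game -> blitz games only -> CSV row."
  (comp
   (map #(json/parse-string-strict % true))
   (filter game-desired?)
   (map #(str (game->date %) "," (game->rating %) "\n"))))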

(defn download-ratings
  "Streams all rated games since the cutoff and writes a date,rating CSV."
  []
  (let [;; perfType is deliberately omitted; game-desired? filters for blitz locally
        url "https://lichess.org/api/games/user/Deltaway?rated=true&since=1709269200000&sort=dateAsc"
        headers {"Authorization" "Bearer <TOKEN_REDACTED>" "Accept" "application/x-ndjson"}
        response (client/get url {:headers headers :as :reader})]
    (with-open [reader (:body response)
                writer (io/writer "ratings.csv")]
      (.write writer "date,rating\n")
      (doseq [line (sequence process-games (line-seq reader))] (.write writer line)))))

(download-ratings)