mattn / go-mastodon

mastodon client for golang
MIT License

Question on pagination #132

Closed carbontwelve closed 4 years ago

carbontwelve commented 4 years ago

I am attempting to use this library to write a bot that posts Markov-generated sentences based on a user's history. Below is a snippet of the code I am using to download an account's history:

pg := mastodon.Pagination{}
last, err := s.db.GetLastStatusIdForAccount(*account)
// Check the error before using the returned value.
if err != nil {
    return cli.Exit(err, 1)
}
fmt.Printf("Continuing from: %s\n", last)

if last != "-1" {
    pg.MinID = last
}

page := 1
for {
    ss, err := s.client.GetAccountStatuses(context.Background(), account.ID, &pg)
    if err != nil {
        return cli.Exit(err, 1)
    }

    err = s.db.InsertStatuses(ss)
    if err != nil {
        return cli.Exit(err, 1)
    }

    fmt.Printf("%d, Messages Found: %d\n", page, len(ss))

    if pg.MaxID == "" {
        break
    }
    pg.SinceID = ""
    pg.MinID = ""
    time.Sleep(3 * time.Second)
    page++
}

Because MinID is reset to an empty string on each iteration, the loop appears to start at the most recent page and walk backwards. My account has about 6,000 statuses, but the loop stops after 70 pages of 20 results plus a final page of 6, for a total of about 1,400 statuses (70 × 20 + 6 = 1,406).

Is there something I am doing wrong or does Mastodon/Pleroma limit how far back in history we can go with the API?

178inaba commented 4 years ago

@carbontwelve I tried fetching the statuses of a bot account on mastodon.cloud: https://mastodon.cloud/web/accounts/556729

package main

import (
    "context"
    "fmt"
    "log"
    "time" // needed for time.Sleep below

    "github.com/mattn/go-mastodon"
)

func main() {
    c := mastodon.NewClient(&mastodon.Config{
        Server:       "https://mastodon.cloud",
        ClientID:     "foofoo",
        ClientSecret: "barbar",
    })

    ctx := context.Background()

    if err := c.Authenticate(ctx, "email", "password"); err != nil {
        log.Fatal(err)
    }

    pg := mastodon.Pagination{}
    page := 1
    for {
        ss, err := c.GetAccountStatuses(ctx, "556729", &pg)
        if err != nil {
            log.Fatal(err)
        }

        fmt.Printf("%d, Messages Found: %d\n", page, len(ss))
        for _, s := range ss {
            fmt.Printf("%v\n", s.ID)
        }
        fmt.Printf("%+v\n", pg)

        if pg.MaxID == "" {
            break
        }
        pg.SinceID = ""
        pg.MinID = ""
        fmt.Printf("%+v\n", pg)
        time.Sleep(3 * time.Second)
        page++
    }
}

This worked. I obtained 125 pages of 20 results each (the last line printed was "125, Messages Found: 20"). Once I had confirmed that I could get past 70 pages, I stopped it at page 125.

Perhaps the instance you are using has API restrictions.

carbontwelve commented 4 years ago

I did some experimenting after posting this issue and discovered something interesting. I have an account on one instance that I last posted to in 2019, and the API returned no results for it; however, as soon as I began posting statuses there again, I was able to obtain those. Similarly, on the instance where I am able to get 70 pages of 20 items plus one page of 8, I posted enough statuses to push that over to 72 in total, re-ran my code, and found that it happily chugged through to 72 pages.

I did wonder whether, because my access to /api/v1/accounts/:id/statuses is unauthenticated (I'm only interested in public statuses anyway), the API hides posts beyond a certain time frame. However, I was able to get my first toot, published on 08/27/2018 @ 8:22pm, from my first instance.

In either case it's a bit weird, but I am able to get a history. It might not be complete, but it's good enough for my use, and I am able to download statuses posted after I last ran my programme.