How to conveniently convert paginate result to pandas DataFrame?

yanyongyu / githubkit

The modern, all-batteries-included GitHub SDK for Python, including rest api, graphql, webhooks, like octokit!

https://yanyongyu.github.io/githubkit/

MIT License

177 stars 25 forks

How to conveniently convert paginate result to pandas DataFrame? #19

Closed (tisonkun closed this issue 1 year ago)

tisonkun commented 1 year ago

Say I fetch the pull requests like this:

    c = obtain_client()
    prs = c.paginate(
        c.rest.pulls.list,
        owner="apache",
        repo="pulsar",
        state="open")

Now prs is a generator that yields PullRequestSimple objects. How can I convert it to a pandas DataFrame like this:

           url   id  node_id  ...  draft
    0      ...  nnn      mmm  ...  False
    ...

tisonkun commented 1 year ago

I just found a way:

    import pandas as pd

    c = obtain_client()
    prs = c.paginate(
        c.rest.pulls.list,
        owner="apache",
        repo="pulsar",
        state="open")
    # Each item from paginate is one model; vars() exposes its fields as a dict.
    df = pd.DataFrame([vars(pr) for pr in prs])

But I still wonder if we can provide some utilities to interoperate with pandas smoothly.

yanyongyu commented 1 year ago

githubkit uses pydantic models to serialize data, so you can look up how to convert pydantic models into a pandas DataFrame. In the code above, you can change vars(pr) to pr.dict(). For more information about the usage, check the pydantic docs.
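As a rough sketch of that suggestion (not a githubkit API), the conversion can be wrapped in a small helper. The names to_records, to_dataframe and FakePull are hypothetical and only for illustration. The helper assumes each item exposes either model_dump() (pydantic v2) or dict() (pydantic v1), and falls back to vars() otherwise. pandas is imported lazily so the record-building part works without it.

```python
from typing import Any, Dict, Iterable, List


def to_records(models: Iterable[Any]) -> List[Dict[str, Any]]:
    """Turn an iterable of pydantic-style models into a list of plain dicts."""
    records = []
    for model in models:
        if hasattr(model, "model_dump"):
            # pydantic v2 models
            records.append(model.model_dump())
        elif hasattr(model, "dict"):
            # pydantic v1 models
            records.append(model.dict())
        else:
            # plain objects: copy the instance attributes
            records.append(dict(vars(model)))
    return records


def to_dataframe(models: Iterable[Any], flatten: bool = False):
    """Build a DataFrame; flatten=True expands nested dicts into dotted columns."""
    import pandas as pd  # imported here so to_records works without pandas

    records = to_records(models)
    return pd.json_normalize(records) if flatten else pd.DataFrame(records)


class FakePull:
    """Hypothetical stand-in for PullRequestSimple, used only for this demo."""

    def __init__(self, id: int, draft: bool) -> None:
        self.id = id
        self.draft = draft

    def dict(self) -> Dict[str, Any]:
        return {"id": self.id, "draft": self.draft}


# With a real client this would be: to_dataframe(c.paginate(c.rest.pulls.list, ...))
records = to_records(FakePull(i, False) for i in (1, 2))
print(records)
```

With the generator from the question, to_dataframe(prs) should give one row per pull request. Nested fields such as the pull request's user arrive as dicts in a single column unless flatten=True is passed, in which case pd.json_normalize spreads them into dotted column names.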

tisonkun commented 1 year ago

@yanyongyu Thank you!