opensource-observer / oso

Measuring the impact of open source software
https://opensource.observer
Apache License 2.0

Eigentrust over the events table #578

Closed: ryscheng closed this issue 3 months ago

ryscheng commented 10 months ago

Describe the feature you'd like to request

The events table defines a graph: each event has a from, a to, and an amount/type, so every row can be read as a weighted, typed edge.

We can run graph algorithms (like PageRank) over this graph to surface insights such as which artifacts receive the most weighted attention.
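To make the idea concrete, here is a minimal sketch of weighted PageRank by power iteration over `(from, to, amount)` edges. It is plain Python with no graph library, the sample events are made up, and the damping factor of 0.85 is a conventional default rather than anything specified in this issue.

```python
from collections import defaultdict

def pagerank(edges, damping=0.85, iterations=100, tol=1e-10):
    """Weighted PageRank by power iteration.

    edges: iterable of (from_id, to_id, amount); amount is the edge weight.
    Returns a dict mapping node id -> score; scores sum to 1.
    """
    out_weight = defaultdict(float)  # total outgoing weight per node
    weights = defaultdict(float)     # (from, to) -> summed weight
    nodes = set()
    for src, dst, amount in edges:
        nodes.update((src, dst))
        if amount > 0:
            weights[(src, dst)] += amount
            out_weight[src] += amount

    n = len(nodes)
    if n == 0:
        return {}
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Nodes with no outgoing weight spread their rank uniformly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for (src, dst), w in weights.items():
            new_rank[dst] += damping * rank[src] * w / out_weight[src]
        delta = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank

# Made-up events in (from, to, amount) form.
events = [
    ("alice", "repo_a", 3.0),
    ("bob", "repo_a", 1.0),
    ("bob", "repo_b", 1.0),
    ("repo_a", "repo_b", 2.0),
]
scores = pagerank(events)
```

Repeated events between the same pair are summed into one edge weight here; whether that is the right aggregation for a given event type is a modeling choice.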

Describe the solution you'd like

This could run as a separate batch job. It could be somewhat meta: the job would write its scores back as a new event type that is itself a time series. That way you can see PageRank over time.

We may also want separate runs over different subsets of the data (e.g. onchain vs GitHub).
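A rough sketch of what such a batch job might look like: it filters events to one subset, re-scores the graph once per calendar month, and emits rows shaped like events. The field names (`time`, `source`, `from`, `to`, `amount`), the `GRAPH_SCORE` event type, and the cumulative monthly windows are all assumptions for illustration. The scorer is pluggable; the stand-in used here is a simple share of inbound weight, not PageRank.

```python
from collections import defaultdict
from datetime import date

def weighted_indegree_share(edges):
    """Stand-in scorer (NOT PageRank): each node's share of total inbound weight."""
    inbound = defaultdict(float)
    for _src, dst, amount in edges:
        inbound[dst] += amount
    total = sum(inbound.values())
    return {node: w / total for node, w in inbound.items()} if total else {}

def score_events_over_time(events, score_fn, source=None, score_type="GRAPH_SCORE"):
    """Re-score the graph once per calendar month and emit event-shaped rows.

    events: dicts with keys time (date), source, from, to, amount (hypothetical schema).
    source: optional filter, e.g. "GITHUB" or "ONCHAIN", to score one subset only.
    Each bucket is cumulative: it uses all events up to the end of that month.
    """
    selected = sorted(
        (e for e in events if source is None or e["source"] == source),
        key=lambda e: e["time"],
    )
    months = sorted({(e["time"].year, e["time"].month) for e in selected})
    rows = []
    for year, month in months:
        window = [
            (e["from"], e["to"], e["amount"])
            for e in selected
            if (e["time"].year, e["time"].month) <= (year, month)
        ]
        for node, score in sorted(score_fn(window).items()):
            rows.append({
                "time": date(year, month, 1),
                "event_type": score_type,
                "to": node,
                "amount": score,
            })
    return rows

# Made-up events from two subsets.
events = [
    {"time": date(2024, 1, 5), "source": "GITHUB", "from": "alice", "to": "repo_a", "amount": 3.0},
    {"time": date(2024, 1, 9), "source": "ONCHAIN", "from": "0xabc", "to": "contract_x", "amount": 10.0},
    {"time": date(2024, 2, 2), "source": "GITHUB", "from": "bob", "to": "repo_b", "amount": 1.0},
]
github_rows = score_events_over_time(events, weighted_indegree_share, source="GITHUB")
```

Because the output rows carry a time and an event type, they could be stored alongside the original events and charted like any other series.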

Describe alternatives you've considered

TBD

ccerv1 commented 3 months ago

Colab version: https://colab.research.google.com/drive/1ojei1TSIGODbV-EhiLC12hC2OG1wKBH1?usp=sharing

Jupyter Notebook version: https://github.com/opensource-observer/insights/blob/main/community/data_challenges/openrank/OpenRank_Starter.ipynb

It's literally as easy as:

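from google.cloud import bigquery    # assumption: `client` is a BigQuery client
from openrank_sdk import EigenTrust  # assumption: EigenTrust is the openrank-sdk class

client = bigquery.Client()

# Each event becomes one local-trust edge: i -> j with weight v.
query = """
    select
      from_artifact_id as i,
      to_artifact_id as j,
      amount as v
    from `opensource-observer.oso.int_events`
    where event_type = 'EVENT_TYPE'  -- placeholder: substitute a real event type
"""
result = client.query(query)
dataframe = result.to_dataframe()
localtrust = dataframe.to_dict("records")  # list of {"i": ..., "j": ..., "v": ...}

a = EigenTrust()
scores = a.run_eigentrust(localtrust)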
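
The query above returns one row per event, so the same `(i, j)` pair can appear many times. Whether the scorer expects duplicates to be collapsed beforehand is not stated in this thread. If it does, a minimal sketch of one way to do it, assuming that summing amounts and dropping self-loops is the intended semantics, could look like this (the helper name is hypothetical):

```python
from collections import defaultdict

def aggregate_localtrust(records):
    """Sum `v` over repeated (i, j) pairs; drop self-loops and non-positive totals."""
    totals = defaultdict(float)
    for r in records:
        if r["i"] != r["j"]:
            totals[(r["i"], r["j"])] += r["v"]
    return [
        {"i": i, "j": j, "v": v}
        for (i, j), v in sorted(totals.items())
        if v > 0
    ]

# Made-up records in the same i/j/v shape as the query output.
records = [
    {"i": "a", "j": "b", "v": 1.0},
    {"i": "a", "j": "b", "v": 2.0},
    {"i": "b", "j": "b", "v": 5.0},
    {"i": "b", "j": "c", "v": 4.0},
]
localtrust = aggregate_localtrust(records)
```

The same aggregation could instead be pushed into the SQL with a `group by` on the two artifact id columns, which avoids pulling every event row into memory.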