opensource-observer / oso

Measuring the impact of open source software
https://opensource.observer
Apache License 2.0

feat: `usergroup` schema #1998

Closed: ccerv1 closed this 1 week ago

ccerv1 commented 2 weeks ago

Describe the feature you'd like to request

A usergroup is to users what a collection is to projects. A user should be able to specify a YAML file that enumerates an array of user_source_ids as part of a usergroup. For example:

version: x
name: kariba_data_collective
display_name: Active Members of Kariba Data Collective
git_users:
  - ccerv1
  - ryscheng
  - ravenac95
farcaster_users:
  - 529
  - 63755
  - 297654
eoas:
  - 0x...
  - 0x...

The collection should then resolve the distinct artifacts owned by members of the usergroup.

Perhaps the usergroup_namespace could be the sub-directory where the usergroup is defined?
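To make the intended shape concrete, here is a minimal Python sketch of how one usergroup definition might be flattened into rows. It is only an illustration under assumptions: the function name `flatten_usergroup`, the row field `user_source`, and the rule "every list-valued key is a member list" are hypothetical and not part of any existing OSO schema. The dict stands in for the parsed YAML above, and the `eoas` entries are left out because the addresses are elided in the example.

```python
from typing import Any, Iterator

# Keys that carry metadata rather than member lists (taken from the example above).
METADATA_KEYS = {"version", "name", "display_name"}


def flatten_usergroup(definition: dict[str, Any], namespace: str) -> Iterator[dict[str, str]]:
    """Yield one row per distinct (source key, user id) pair in a usergroup."""
    seen: set[tuple[str, str]] = set()
    for key, members in definition.items():
        # Assumption: any list-valued, non-metadata key enumerates user_source_ids.
        if key in METADATA_KEYS or not isinstance(members, list):
            continue
        for member in members:
            user_source_id = str(member)  # numeric ids (e.g. Farcaster) become strings
            if (key, user_source_id) in seen:
                continue  # skip duplicates within the same source
            seen.add((key, user_source_id))
            yield {
                "usergroup_namespace": namespace,
                "usergroup_name": definition["name"],
                "user_source": key,
                "user_source_id": user_source_id,
            }


# Parsed form of the YAML example (eoas omitted: the addresses are elided there).
example = {
    "version": "x",
    "name": "kariba_data_collective",
    "display_name": "Active Members of Kariba Data Collective",
    "git_users": ["ccerv1", "ryscheng", "ravenac95"],
    "farcaster_users": [529, 63755, 297654],
}

rows = list(flatten_usergroup(example, namespace="oso"))
```

Here the namespace is passed in explicitly; if it is derived from the sub-directory as suggested, the caller would supply the directory name instead.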

Describe the solution you'd like

We need both static usergroups and dynamic ones. A dynamic usergroup would be something like a bot filter or an OpenRank-generated list.

I could imagine a new directory we maintain that looks something like this:

usergroups/
│
├── oso/
│   ├── models/
│   │   ├── oso_bot_filter.sql
│   │   ├── oso_power_addresses.sql
│   │
│   └── static/
│       ├── kariba_data_collective.yaml
│       └── data_challenge_winners.yaml
│
└── superchain/
    ├── models/
    │   ├── superchain_bot_filter.sql
    │   ├── superchain_power_addresses.sql
    │   └── superchain_trusted_users.sql
    │
    └── static/
        ├── op_airdrop_1.yaml
        ├── op_airdrop_2.yaml
        ├── op_badgeholders.yaml
        └── op_coredevs.yaml

We would then need some dbt macros to process these into a usergroups table.
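As a rough illustration of the discovery step that would come before any such processing, the sketch below walks the proposed layout and classifies each file as a static (YAML) or dynamic (SQL model) usergroup, taking the namespace from the top-level sub-directory. It is plain Python rather than a dbt macro, and `UsergroupSource` and `discover_usergroups` are hypothetical names used only for this example.

```python
from pathlib import Path
from typing import Iterator, NamedTuple


class UsergroupSource(NamedTuple):
    namespace: str  # top-level sub-directory, e.g. "oso" or "superchain"
    kind: str       # "static" for YAML lists, "dynamic" for SQL models
    name: str       # file name without its extension
    path: Path


def discover_usergroups(root: Path) -> Iterator[UsergroupSource]:
    """Walk <root>/<namespace>/{static,models}/ and classify each definition file."""
    for path in sorted(root.glob("*/static/*.yaml")):
        yield UsergroupSource(path.parent.parent.name, "static", path.stem, path)
    for path in sorted(root.glob("*/models/*.sql")):
        yield UsergroupSource(path.parent.parent.name, "dynamic", path.stem, path)


if __name__ == "__main__":
    # Assumes the script runs from the directory that contains usergroups/.
    for source in discover_usergroups(Path("usergroups")):
        print(f"{source.namespace}.{source.name} ({source.kind})")
```

A listing like this could feed whatever step ends up building the usergroups table, whether that is a dbt macro or something else.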

Describe alternatives you've considered

We could continue to hardcode usergroups into our dbt models and upload static datasets to BigQuery.

ryscheng commented 2 weeks ago

Summarizing my thoughts from our conversation today:

Makes sense to start with the dbt model for this. I think there's a good chance 90% of our use cases will be covered by writing dbt models, whose data could come from data connectors or from static tables in static_sources.

As a second step, it'd be cool to add some UX to the OSO web app to easily add / tag new user groups. We should file a separate issue for this. Not opposed to making a YAML schema in the future, but I think it might not be as useful as the prior 2 points.

ryscheng commented 1 week ago

Gonna close this out based on the discussion above. I'll comment on this issue for follow-on work: https://github.com/opensource-observer/oso/issues/1997