oss-aspen / Rappel

Apache License 2.0

Add recency and tf-idf weighted edges demo nb #243

Closed christinaexyou closed 10 months ago

christinaexyou commented 11 months ago

@JamesKunstle this addresses #216 by demonstrating the preprocess, calc_recency_weights, and calc_tfidf_weights methods in graph_helper_functions.ipynb.

cc: @hemajv @oindrillac


cdolfi commented 11 months ago

@christinaexyou what is the difference between this and #236?

christinaexyou commented 11 months ago

@cdolfi this PR walks through the logic and use cases of the changes made in #236 to give more context. However, I'm not sure if we want to include it in Rappel yet, which is why I made a separate PR and am waiting for feedback.

review-notebook-app[bot] commented 10 months ago


JamesKunstle commented on 2023-11-14T18:40:14Z ----------------------------------------------------------------

Line #22.    plt.show()

This is so cool.


review-notebook-app[bot] commented 10 months ago


JamesKunstle commented on 2023-11-14T18:40:15Z ----------------------------------------------------------------

This is fantastic work. I'd like to amend one term: "relevant." I think "focused" would be clearer.

How would you interpret the case when the same node has many dark edges? e.g. right in the middle, 71292?


christinaexyou commented on 2023-11-15T13:53:09Z ----------------------------------------------------------------

It means that 71292 receives a high volume of issues from a large set of contributors who create significantly fewer issues in the other repos in the group. A dark edge, i.e. a relatively high TF-IDF value, can indicate that a contributor has specialized knowledge, since it means they are contributing significantly to a single repo. Repo 71292 corresponds to the PyTorch Operator in KubeFlow, which requires specialized knowledge of PyTorch. But given that we're looking at issues created, I doubt that every "focused" issue contributor has specialized knowledge.
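
To make the "dark edge" reading concrete, here is a minimal sketch of one plausible TF-IDF formulation for contributor-to-repo edges. It is an assumption for illustration, not necessarily what calc_tfidf_weights does in the notebook: repos are treated as documents and contributors as terms, and the function name `tfidf_edge_weights`, the contributor names, and the issue counts are all hypothetical.

```python
import math
from collections import defaultdict


def tfidf_edge_weights(issue_counts):
    """Compute a TF-IDF weight for each (contributor, repo) edge.

    issue_counts maps (contributor, repo) -> number of issues created (> 0).
    Assumed formulation (not taken from the Rappel source):
      TF  = contributor's share of all issues created in that repo
      IDF = log(total repos / repos the contributor has created issues in)
    """
    repos = {repo for _, repo in issue_counts}
    repo_totals = defaultdict(int)
    repos_per_contributor = defaultdict(set)
    for (contributor, repo), n in issue_counts.items():
        repo_totals[repo] += n
        repos_per_contributor[contributor].add(repo)

    weights = {}
    for (contributor, repo), n in issue_counts.items():
        tf = n / repo_totals[repo]
        idf = math.log(len(repos) / len(repos_per_contributor[contributor]))
        weights[(contributor, repo)] = tf * idf
    return weights


# Hypothetical issue counts, made up for illustration only.
counts = {
    ("alice", "pytorch-operator"): 8,
    ("bob", "pytorch-operator"): 2,
    ("bob", "pipelines"): 5,
    ("bob", "katib"): 5,
    ("carol", "katib"): 5,
}
weights = tfidf_edge_weights(counts)
for edge, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(edge, round(weight, 3))
```

Under this formulation, alice's edge to pytorch-operator gets the highest weight because she creates most of that repo's issues and is active nowhere else, while every edge from bob gets a weight of zero because he creates issues in all three repos. That matches the "focused" interpretation above, and it also shows why a high weight alone cannot confirm specialized knowledge: it only measures where issue creation is concentrated.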