ubiquity-os-marketplace / text-conversation-rewards


Deduplication Should Not Receive Credit #276

Closed 0x4007 closed 1 month ago

0x4007 commented 2 months ago

We use footnote links for `/annotate` and issue deduplication.

Excessive credit is being given to the author of any comment that contains those generated footnotes.

https://github.com/ubiquity-os-marketplace/command-wallet/issues/47#issuecomment-2661604653

Do not count deduplicated footnotes. I suppose we can target the 0 prefix in the footnote ID, but it also makes me think we need a more robust targeting system for the future. For now, using a regex to remove footnotes with IDs starting with that prefix from the content during preprocessing should be fine.
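As a rough illustration of the regex approach described above, here is a minimal sketch. The function name `stripGeneratedFootnotes` and both patterns are hypothetical and are not taken from the plugin. It assumes generated footnote IDs are numeric, start with `0`, and may be written with one or two carets (`[^01]` or `[^01^]`).

```typescript
// Hypothetical preprocessing helper, not the plugin's actual code.
// Matches definition lines such as "[^01^]: ⚠ 77% possible duplicate - ...".
const GENERATED_DEFINITION = /^[ \t]*\[\^0\d*\^?\]:.*$/gm;
// Matches inline references such as "[^01^]" or "[^01]", plus preceding spaces.
const GENERATED_REFERENCE = /[ \t]*\[\^0\d*\^?\](?!:)/g;

function stripGeneratedFootnotes(markdown: string): string {
  return markdown
    .replace(GENERATED_DEFINITION, "") // drop generated footnote definitions
    .replace(GENERATED_REFERENCE, "") // drop inline references to them
    .replace(/\n{3,}/g, "\n\n") // collapse the blank lines left behind
    .trim();
}
```

A footnote written by a user whose ID also happens to start with `0` would be removed as well, which is why a more precise prefix is worth considering.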

In the near future, we should actually prefix the generated footnotes with something more precise like dedupe- or no-credit or even better, check the diff and only offer credit if the user wrote the link, not a bot. [^01^]

If we can identify who wrote what, then we can assign credit very precisely where it's due.

[^01^]: ⚠ 77% possible duplicate - Crediting for unique links only

ubiquity-os-beta[bot] commented 2 months ago

[!NOTE] The following contributors may be suitable for this task:

gentlementlegen

80% Match ubiquity-os-marketplace/text-conversation-rewards#155

koya0

80% Match ubiquity-os-marketplace/text-vector-embeddings#67

ubiquity-os-beta[bot] commented 2 months ago

[!NOTE] The following contributors may be suitable for this task:

koya0

81% Match ubiquity-os-marketplace/text-vector-embeddings#67

Keyrxng

79% Match ubiquity/devpool-directory-tasks#15

gentlementlegen

77% Match ubiquity-os-marketplace/text-conversation-rewards#155

whilefoo commented 2 months ago

Since all deduplication notes start with `⚠ xx% possible duplicate` and annotate notes start with `xx% similar to issue`, we can target those strings to avoid removing the author's footnotes (I wonder if the plugin clashes with the author's footnotes).

Another option is to enclose plugin's footnotes with <!-- deduplication-footnotes-start --> and <!-- deduplication-footnotes-end -->.
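The marker idea could look roughly like the sketch below. It assumes the plugin wraps its generated footnote definitions in exactly the two HTML comments named above; the function name `stripMarkedFootnotes` is hypothetical.

```typescript
// Marker strings as proposed in the comment above.
const START_MARKER = "<!-- deduplication-footnotes-start -->";
const END_MARKER = "<!-- deduplication-footnotes-end -->";

// Hypothetical helper: removes every region enclosed by the two markers.
function stripMarkedFootnotes(body: string): string {
  let result = body;
  let start = result.indexOf(START_MARKER);
  while (start !== -1) {
    const end = result.indexOf(END_MARKER, start + START_MARKER.length);
    if (end === -1) break; // unmatched start marker: leave the rest untouched
    result = result.slice(0, start) + result.slice(end + END_MARKER.length);
    start = result.indexOf(START_MARKER);
  }
  return result.trim();
}
```

This only removes the enclosed definitions. Inline references in the comment body sit outside the markers and would need separate handling.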

0x4007 commented 2 months ago

Possible duplicate I think we could change to say something like "possibly related".

I think the 0 prefix or something similar could be a good indicator.

I always like metadata as well.

gentlementlegen commented 1 month ago

Another option is to enclose plugin's footnotes with <!-- deduplication-footnotes-start --> and <!-- deduplication-footnotes-end -->.

I like the idea but I think links would be credited still because the link within the comment (e.g. note[1]) would probably count as a link. Actually I think that footnotes themselves are already excluded (their content is not evaluated); the rewards come from the footnote links in the spec itself.

https://github.com/ubiquity-os-marketplace/text-conversation-rewards/blob/b9189bd8af038e750c4e9843a6b4599f25809298/src/parser/data-purge-module.ts#L65

One idea could be to ignore links redirecting to the same url as the issue they are written on?
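A minimal sketch of that idea, assuming the evaluator knows the URL of the issue a comment belongs to. The names `isSelfLink` and `pageKey` are hypothetical. It treats a link as pointing to the same page when host and path match, ignoring the fragment, the query string, and a leading `www.`.

```typescript
// Hypothetical helpers, not part of the plugin.
// Reduces a URL to host + path so fragments and "www." do not matter.
function pageKey(url: URL): string {
  return url.hostname.replace(/^www\./, "") + url.pathname.replace(/\/+$/, "");
}

// Returns true when `href` resolves to the same page as `issueUrl`.
function isSelfLink(href: string, issueUrl: string): boolean {
  try {
    const issue = new URL(issueUrl);
    const link = new URL(href, issueUrl); // also resolves bare "#fragment" hrefs
    return pageKey(link) === pageKey(issue);
  } catch {
    return false; // unparsable issue URL: treat the link as a normal link
  }
}
```

Links for which `isSelfLink` returns true could then be skipped when counting `a` elements, while links to other pages would still be scored.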

I wonder if the plugin clashes with author's footnotes

Probably all the footnotes will be excluded.

gentlementlegen commented 1 month ago

/start

ubiquity-os-beta[bot] commented 1 month ago

Beneficiary: 0x0fC1b909ba9265A846b82CF4CE352fc3e7EeB2ED

[!TIP]

  • Use /wallet 0x0000...0000 if you want to update your registered payment wallet address.
  • Be sure to open a draft pull request as soon as possible to communicate updates on your progress.
  • Be sure to provide timely updates to us when requested, or you will be automatically unassigned from the task.

ubiquity-os-beta[bot] commented 1 month ago

[!IMPORTANT]

  • Be sure to link a pull-request before the first reminder to avoid disqualification.
  • Reminders will be sent every 21 hours if there is no activity.
  • Assignees will be disqualified after 1 day and 18 hours of inactivity.

0x4007 commented 1 month ago

Another option is to enclose plugin's footnotes with <!-- deduplication-footnotes-start --> and <!-- deduplication-footnotes-end -->.

I like the idea but I think links would be credited still because the link within the comment (e.g. note[1]) would probably count as a link. Actually I think that footnotes themselves are already excluded (their content is not evaluated); the rewards come from the footnote links in the spec itself.

text-conversation-rewards/src/parser/data-purge-module.ts

Line 65 in b9189bd

.replace(/^###### .?[\^\d+\^][\s\S]$/gm, "")

One idea could be to ignore links redirecting to the same url as the issue they are written on?

I wonder if the plugin clashes with author's footnotes

Probably all the footnotes will be excluded.

Footnotes can surely be distinguished from normal links. Let's take a step back and think about it.

The portion inside the body should be a hash link to the same page the comment is written on. The footnote may have a normal link that goes to an external source which we should reward.

If we check the revision history and see who wrote it, we can assign credit accordingly. If the bot wrote something, then no credit is generated. etc.

Truly I think that seeing who wrote what is the most robust solution. We should definitely be doing this.
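To make the revision-based idea concrete, here is a small sketch under stated assumptions. It assumes the edit history of a comment is already available as an ordered list (oldest first) and that each entry records whether a bot authored it. The `Revision` shape, the URL pattern, and the function name `creditableLinks` are all hypothetical.

```typescript
// Hypothetical shape of one entry in a comment's edit history.
interface Revision {
  authorLogin: string;
  authorIsBot: boolean;
  body: string;
}

// Simplified URL matcher, for illustration only.
const URL_PATTERN = /https?:\/\/[^\s)<>\]]+/g;

// Returns the links in the latest revision that a human introduced first.
function creditableLinks(revisions: Revision[]): string[] {
  const introducedByBot = new Map<string, boolean>();
  for (const revision of revisions) {
    for (const url of revision.body.match(URL_PATTERN) ?? []) {
      // Only the first appearance decides who gets attributed.
      if (!introducedByBot.has(url)) introducedByBot.set(url, revision.authorIsBot);
    }
  }
  const finalBody = revisions.length > 0 ? revisions[revisions.length - 1].body : "";
  const finalUrls = Array.from(new Set(finalBody.match(URL_PATTERN) ?? []));
  return finalUrls.filter((url) => introducedByBot.get(url) === false);
}
```

With this rule, a link added by the deduplication bot in a later edit earns no credit, while a link the author wrote in the first revision still does.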

gentlementlegen commented 1 month ago

A footnote in MD form is text[^1] and [^1]: footnote. Transformed into HTML, it is [^01^]: ⚠ 67% possible duplicate - <a href="https://www.github.com/ubiquity-os-marketplace/command-wallet/issues/20#20">no docs for populating supabase</a>. I believe GitHub parses it and transforms it on its side, so this is not visible in the HTML either and is not distinguishable.

We could eventually rely on the revision history. But this also will collide with the rewrite plugin, so issue specs won't be rewarded anymore if the user triggered a rewriting because the changes will be authored by the bot.

0x4007 commented 1 month ago

But this also will collide with the rewrite plugin, so issue specs won't be rewarded anymore if the user triggered a rewriting because the changes will be authored by the bot.

If this is the only exception, let's allow it to be a problem but only map it to /rewrite instead of including the time label change. Alternatively, we can adjust the prompt to make absolutely minimal changes to the spec, which I think is a good idea either way. If it leaves some of the original authored content then this is perfectly valid!


Footnotes in the source code when I write them look like this:

My footnote[^1^]

[^1^]: details

0x4007 commented 1 month ago

test ^1

0x4007 commented 1 month ago

You taught me something new, I didn't realize that we only need a single ^

gentlementlegen commented 1 month ago

Okay, then maybe let's use regex as an immediate fix, and going forward let's solve https://github.com/ubiquity-os-marketplace/text-conversation-rewards/issues/201 so we have a precise evaluation of footnotes based on the comment history.

ubiquity-os-beta[bot] commented 1 month ago

 [ 108.12 WXDAI ] 

@gentlementlegen

Contributions Overview

| View | Contribution | Count | Reward |
| --- | --- | --- | --- |
| Issue | Task | 1 | 100 |
| Issue | Comment | 1 | 8.12 |

Conversation Incentives

Comment: I like the idea but I think links would be credited still becaus…
Formatting: 9.06
content:
  content:
    p:
      score: 0
      elementCount: 1
    a:
      score: 5
      elementCount: 1
  result: 5
regex:
  wordCount: 78
  wordValue: 0.1
  result: 4.06
Relevance: 0.7
Priority: 4
Reward: 8.12

 [ 11.28 WXDAI ] 

@whilefoo

Contributions Overview

| View | Contribution | Count | Reward |
| --- | --- | --- | --- |
| Issue | Comment | 1 | 11.28 |

Conversation Incentives

Comment: Since all deduplication notes start with `⚠ xx% possible dup…
Formatting: 2.2
content:
  content:
    p:
      score: 0
      elementCount: 1
  result: 0
regex:
  wordCount: 38
  wordValue: 0.1
  result: 2.2
Relevance: 1
Priority: 4
Reward: 11.28

 [ 86.704 WXDAI ] 

@0x4007

Contributions Overview

| View | Contribution | Count | Reward |
| --- | --- | --- | --- |
| Review | Base Review for #308 | 1 | 25 |
| Review | Code Review | 1 | 1.32 |
| Issue | Specification | 1 | 14.44 |
| Issue | Comment | 5 | 45.944 |

Review Details for #308

| Changes | Priority | Reward |
| --- | --- | --- |
| +28 -5 | 4 | 1.32 |

Conversation Incentives

Comment: We use footnote links for `/annotate` and issue deduplic…
Formatting: 11.3
content:
  content:
    p:
      score: 0
      elementCount: 1
    br:
      score: 0
      elementCount: 2
    a:
      score: 5
      elementCount: 1
  result: 5
regex:
  wordCount: 131
  wordValue: 0.1
  result: 6.3
Relevance: 1
Priority: 1
Reward: 14.44

Comment: Possible duplicate I think we could change to say something like…
Formatting: 1.95
content:
  content:
    p:
      score: 0
      elementCount: 1
  result: 0
regex:
  wordCount: 33
  wordValue: 0.1
  result: 1.95
Relevance: 0.7
Priority: 4
Reward: 7.056

Comment: Footnotes can be distinguished against normal links surely. Lets…
Formatting: 4.97
content:
  content:
    p:
      score: 0
      elementCount: 1
  result: 0
regex:
  wordCount: 99
  wordValue: 0.1
  result: 4.97
Relevance: 1
Priority: 4
Reward: 24.88

Comment: If this is the only exception lets allow it to be a problem but …
Formatting: 4.88
content:
  content:
    h2:
      score: 1
      elementCount: 1
    p:
      score: 0
      elementCount: 1
  result: 1
regex:
  wordCount: 74
  wordValue: 0.1
  result: 3.88
Relevance: 0.5
Priority: 4
Reward: 12.52

Comment: test
Formatting: 0.1
content:
  content:
    p:
      score: 0
      elementCount: 1
  result: 0
regex:
  wordCount: 1
  wordValue: 0.1
  result: 0.1
Relevance: 0
Priority: 4
Reward: 0

Comment: You taught me something new, I didn't realize that we only need …
Formatting: 1
content:
  content:
    p:
      score: 0
      elementCount: 1
  result: 0
regex:
  wordCount: 15
  wordValue: 0.1
  result: 1
Relevance: 0.3
Priority: 4
Reward: 1.488
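
The figures in the reward summaries above are consistent with two simple sums: each Formatting value equals the content result plus the regex result, and each contributor's total equals the sum of their per-row rewards. The sketch below only checks that arithmetic. It does not reproduce how the plugin derives the individual rewards, and the helper names are hypothetical.

```typescript
// Hypothetical helpers for checking the arithmetic in the summaries above.
// Rounds to three decimals to avoid floating point noise in comparisons.
const round3 = (value: number): number => Math.round(value * 1000) / 1000;

// Formatting value as it appears in the tables: content result + regex result.
function formattingScore(contentResult: number, regexResult: number): number {
  return round3(contentResult + regexResult);
}

// Total for one contributor: the sum of the rewards in their overview rows.
function totalReward(rewards: number[]): number {
  return round3(rewards.reduce((sum, reward) => sum + reward, 0));
}

// Values copied from the summaries above.
const specFormatting = formattingScore(5, 6.3); // spec comment by @0x4007
const total0x4007 = totalReward([25, 1.32, 14.44, 45.944]);
console.log(specFormatting, total0x4007);
```

Under this check, a removed footnote link in the spec would lower the content result of 5 that comes from the single `a` element, which is the excess credit the issue describes.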