Closed gentlementlegen closed 3 months ago
@gentlementlegen the deadline is at 2024-07-18T09:12:04.074Z
Hey the regex needs to be improved for the sentences.
For example, when I write "i.e.", this is credited as two sentences.
We must check for a letter, a period, and a space, I guess. So something like `\w\.\s`, although I'm just winging it from my phone. I'm sure this can be improved.
In addition, if there are excited contributors writing "!!", then we should only credit it one time. Try to think of edge cases for the regexes.
Consider that some old school users use double space after a sentence stop. Also we can consider looking for a capitalized character after to indicate a new sentence has started, although this seems less robust potentially in case there are lazy writers in all lowercase.
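A minimal sketch of the edge cases listed above, assuming a hypothetical `countSentences` helper (it is not the plugin's implementation). It treats a run of terminal punctuation such as "!!" as a single stop, accepts any amount of whitespace after the stop, does not depend on capitalization, and skips a small assumed list of abbreviations such as "i.e.".

```typescript
// Hypothetical helper, not taken from the plugin.
// Abbreviations that contain periods but do not end a sentence (assumed list).
const ABBREVIATIONS = ["i.e.", "e.g."];

function countSentences(text: string): number {
  // Mask the periods inside known abbreviations so they are not seen as stops.
  let masked = text;
  for (const abbr of ABBREVIATIONS) {
    masked = masked.split(abbr).join(abbr.replace(/\./g, "_"));
  }
  // A sentence stop: a word character, optional closing quote or bracket,
  // a run of . ! or ?, then whitespace or the end of the input.
  const stops = masked.match(/\w["')\]]*[.!?]+(?=\s|$)/g);
  return stops ? stops.length : 0;
}
```

Text that never ends with a stop is not credited for its trailing fragment in this sketch. Whether that is desirable is a design choice left open here.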
Using ChatGPT could be a robust approach to break down sentences, although I know that these AIs are infamous for being bad at counting things. I tried feeding it your comment and the result was correct, however. I do not know if this is overkill or needed, but that's a possible path. We could even ask it to sort all the content per tag, count content, etc.
No definitely overkill. Just regex is sufficient.
We currently only use `wordCount` to apply a value per word. Having word counter + sentence counter, how do they coexist with each other? What should be the total calculation formula? Currently we use `((count * wordValue) * (score * formattingMultiplier) * n) * relevance + task.reward = total`.
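The formula quoted above can be written out as a small function. This is only a sketch: the field names mirror the formula as written, the `RewardInput` type and `totalReward` name are hypothetical, and `n` is not defined anywhere in the thread, so it is treated here as a plain extra factor.

```typescript
// Hypothetical input shape; field names mirror the formula quoted in the thread.
interface RewardInput {
  count: number;                // number of words counted
  wordValue: number;            // value credited per word
  score: number;                // score of the element the words sit in
  formattingMultiplier: number; // multiplier applied to the formatting score
  n: number;                    // not defined in the thread; assumed to be a plain factor
  relevance: number;            // relevance factor of the comment
  taskReward: number;           // `task.reward` in the quoted formula
}

function totalReward(input: RewardInput): number {
  const formatting =
    input.count * input.wordValue * (input.score * input.formattingMultiplier) * input.n;
  return formatting * input.relevance + input.taskReward;
}
```

For instance, 42 words at a value of 0.2 with a relevance of 0.8, all other factors set to 1 and no task reward, gives approximately 6.72.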
I think let's let the partners deal with that problem. For example, if they enable both then they need to do the math and figure out what rewards make sense with the double counting. I suggest that they should only use one or the other, but if they are really pro, they might figure out a way to get them both enabled. Let's keep it simple.
I'll probably try some experiments in the near future when it's stable.
But the main benefit of implementing this "symbols" section is that we should be able to target any arbitrary pattern, which seems like a generally useful tool.
So for now should I include them in the final calculation or have them as information to display only? Having patterns is nice but also error prone and hard to debug. I should be able to use the string as a regex.
Definitely include in the reward total. Do a regex eval so the partner can pass in any arbitrary pattern.
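Since patterns passed in as strings are error prone, one option is to compile them defensively. The sketch below assumes two hypothetical helpers, `compilePattern` and `countMatches`. It relies only on the standard `RegExp` constructor, which throws a `SyntaxError` on an invalid pattern.

```typescript
// Hypothetical helper: turn a pattern string from the config into a global RegExp.
function compilePattern(pattern: string): RegExp | null {
  try {
    return new RegExp(pattern, "g");
  } catch {
    // An invalid pattern yields null instead of crashing the whole evaluation.
    return null;
  }
}

// Count how many times the pattern matches; an invalid pattern counts as zero.
function countMatches(text: string, pattern: string): number {
  const regex = compilePattern(pattern);
  if (regex === null) return 0;
  const matches = text.match(regex);
  return matches ? matches.length : 0;
}
```

Returning zero for an invalid pattern keeps the reward run alive, but it also hides the mistake. Surfacing a configuration error to the partner may be preferable.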
! Failed to run comment evaluation. SyntaxError: Unterminated string in JSON at position 2
@0x4007 This is also because of the use of a single token here. I'll have a look today, it seems to be the only case where it breaks, all the other ones are calculated properly with the dummy response.
What does the personal access token for the directory have to do with the conversation rewards OpenAI call?
I am talking about the tokens allocated to OpenAI response generation.
View | Contribution | Count | Reward |
---|---|---|---|
Issue | Task | 1 | 200 |
Issue | Specification | 1 | 78 |
Issue | Comment | 5 | 26.96 |
Review | Comment | 28 | 0 |
Comment | Formatting | Relevance | Reward |
---|---|---|---|
I am not sure what the clearest way to express this is, but the … | 78content: h2: symbols: \b\w+\b: count: 86 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 32 multiplier: 0.1 score: 5 p: symbols: \b\w+\b: count: 14 multiplier: 0.1 score: 1 pre: symbols: \b\w+\b: count: 32 multiplier: 0.1 score: 0 hr: symbols: \b\w+\b: count: 1 multiplier: 0.1 score: 0 em: symbols: \b\w+\b: count: 14 multiplier: 0.1 score: 0 multiplier: 3 | 1 | 78 |
Using ChatGpt could be a robust approach to break down sentences… | 13.6content: p: symbols: \b\w+\b: count: 68 multiplier: 0.2 score: 1 multiplier: 1 | 0.6 | 8.16 |
We currently only use `wordCount` to apply a value per w… | 10.2content: p: symbols: \b\w+\b: count: 42 multiplier: 0.2 score: 1 code: symbols: \b\w+\b: count: 9 multiplier: 0.2 score: 1 multiplier: 1 | 0.9 | 9.18 |
So for now should I include them in the final calculation or hav… | 8.4content: p: symbols: \b\w+\b: count: 42 multiplier: 0.2 score: 1 multiplier: 1 | 0.8 | 6.72 |
@0x4007 This is also because of the use of a single token [here]… | 8.2content: p: symbols: \b\w+\b: count: 40 multiplier: 0.2 score: 1 a: symbols: \b\w+\b: count: 1 multiplier: 0.2 score: 1 multiplier: 1 | 0.3 | 2.46 |
I am talking about the tokens allocated to OpenAI response gener… | 2.2content: p: symbols: \b\w+\b: count: 11 multiplier: 0.2 score: 1 multiplier: 1 | 0.2 | 0.44 |
Resolves #62 Depends on #58, #55 <!-- - You must link the… | 0content: p: symbols: \b\w+\b: count: 6 multiplier: 0 score: 1 ul: symbols: \b\w+\b: count: 45 multiplier: 0 score: 1 li: symbols: \b\w+\b: count: 35 multiplier: 0 score: 1 multiplier: 0 | 0.1 | - |
I defaults to 0 when no found indeed. https://github.com/ubiqui… | 0content: p: symbols: \b\w+\b: count: 23 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
That would we doable, just would need to be documented. The plus… | 0content: p: symbols: \b\w+\b: count: 28 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
I fixed it on another PR, will fix it here too (it's just displa… | 0content: p: symbols: \b\w+\b: count: 23 multiplier: 0.2 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
Yes, I think this was there because the defaults were not proper… | 0content: p: symbols: \b\w+\b: count: 19 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
Honestly not sure about that one, that is possible. I can make a… | 0content: p: symbols: \b\w+\b: count: 18 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
This is a shortcut to find back the Formatting value without hav… | 0content: p: symbols: \b\w+\b: count: 17 multiplier: 0.2 score: 1 img: symbols: \b\w+\b: count: 1 multiplier: 0.2 score: 0 multiplier: 0 | 1 | - |
Named it after the spec, I can change it. @0x4007 What do you pr… | 0content: p: symbols: \b\w+\b: count: 14 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
Named it after the spec, I can change it. @0x4007 What do you pr… | 0content: p: symbols: \b\w+\b: count: 14 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
The multiplier inheritance has been an ongoing discussion yes. B… | 0content: p: symbols: \b\w+\b: count: 67 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
I did that because contents like `</img>` for exam… | 0content: p: symbols: \b\w+\b: count: 37 multiplier: 0.2 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
@0x4007 Actually it happens more than you think I believe, for e… | 0content: p: symbols: \b\w+\b: count: 12 multiplier: 0.2 score: 1 ul: symbols: \b\w+\b: count: 23 multiplier: 0.2 score: 1 li: symbols: \b\w+\b: count: 23 multiplier: 0.2 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
Yes but content should not be counted twice, as @whilefoo stated… | 0content: p: symbols: \b\w+\b: count: 28 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
Currently each item has a default, which is all the same. Whenev… | 0content: p: symbols: \b\w+\b: count: 4 multiplier: 0.2 score: 1 pre: symbols: \b\w+\b: count: 2 multiplier: 0.2 score: 1 code: symbols: \b\w+\b: count: 2 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
Sure we can have another task opened for that. Shall I rename to… | 0content: p: symbols: \b\w+\b: count: 18 multiplier: 0.2 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
I can do that but I have to change the configuration schema agai… | 0content: p: symbols: \b\w+\b: count: 31 multiplier: 0.2 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
Oken then I'll change the whole config shape, will put that PR i… | 0content: p: symbols: \b\w+\b: count: 23 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
@whilefoo yes that's pretty much what is happening there. Shall … | 0content: p: symbols: \b\w+\b: count: 26 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
@0x4007 I guess this will be covered by https://github.com/ubiqu… | 0content: p: symbols: \b\w+\b: count: 16 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
I will rename the variables to match the configuration. The `… | 0content: p: symbols: \b\w+\b: count: 40 multiplier: 0.2 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
## What's new The following changes in the configuration have b… | 0content: h2: symbols: \b\w+\b: count: 2 multiplier: 0.2 score: 1 p: symbols: \b\w+\b: count: 11 multiplier: 0.2 score: 1 ul: symbols: \b\w+\b: count: 49 multiplier: 0.2 score: 1 li: symbols: \b\w+\b: count: 19 multiplier: 0.2 score: 1 code: symbols: \b\w+\b: count: 3 multiplier: 0.2 score: 1 a: symbols: \b\w+\b: count: 2 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
@rndquu Seems there is a lot of tabulations [here](https://githu… | 0content: p: symbols: \b\w+\b: count: 9 multiplier: 0.2 score: 1 a: symbols: \b\w+\b: count: 1 multiplier: 0.2 score: 1 pre: symbols: \b\w+\b: count: 178 multiplier: 0.2 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
@rndquu I could finally reproduce on my own branch as well, than… | 0content: p: symbols: \b\w+\b: count: 16 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
@rndquu Sorry about all the fixes. I tested with the following c… | 0content: p: symbols: \b\w+\b: count: 6 multiplier: 0.2 score: 1 pre: symbols: \b\w+\b: count: 37 multiplier: 0.2 score: 1 code: symbols: \b\w+\b: count: 37 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
@0x4007 Covered by https://github.com/ubiquibot/conversation-rew… | 0content: p: symbols: \b\w+\b: count: 11 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
@Keyrxng There were major configuration changes after the last r… | 0content: p: symbols: \b\w+\b: count: 52 multiplier: 0.2 score: 1 pre: symbols: \b\w+\b: count: 175 multiplier: 0.2 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
@Keyrxng can make it as another task because I think it is trick… | 0content: p: symbols: \b\w+\b: count: 53 multiplier: 0.2 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
@Keyrxng please make a spec 😄 | 0content: p: symbols: \b\w+\b: count: 5 multiplier: 0.2 score: 1 multiplier: 0 | 1 | - |
View | Contribution | Count | Reward |
---|---|---|---|
Issue | Comment | 5 | 20.96 |
Review | Comment | 17 | 55.7 |
Comment | Formatting | Relevance | Reward |
---|---|---|---|
Hey the regex needs to be improved for the sentences. For examp… | 13content: p: symbols: \b\w+\b: count: 128 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 2 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 13 |
No definitely overkill. Just regex is sufficient. | 0.7content: p: symbols: \b\w+\b: count: 7 multiplier: 0.1 score: 1 multiplier: 1 | 0.2 | 0.14 |
I think lets let the partners deal with that problem. For exampl… | 10.5content: p: symbols: \b\w+\b: count: 105 multiplier: 0.1 score: 1 multiplier: 1 | 0.6 | 6.3 |
Definitely include in the reward total. Do a regex eval so the p… | 1.9content: p: symbols: \b\w+\b: count: 19 multiplier: 0.1 score: 1 multiplier: 1 | 0.8 | 1.52 |
What does the personal access token for the directory have to do… | 0content: p: symbols: \b\w+\b: count: 19 multiplier: 0.1 score: 1 multiplier: 1 | - | - |
```suggestion Here is a possible valid configuratio… | 3.6content: pre: symbols: \b\w+\b: count: 18 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 18 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 3.6 |
```suggestion readonly _configuration: DataPurgeC… | 1.9content: pre: symbols: \b\w+\b: count: 7 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.1 score: 1 p: symbols: \b\w+\b: count: 11 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 1.9 |
This seems like an incomplete list of HTML entities. I assume th… | 2.1content: p: symbols: \b\w+\b: count: 20 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 2.1 |
Might be useful to make special aliases for our config to make i… | 3.7content: p: symbols: \b\w+\b: count: 34 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 3 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 3.7 |
Default to `1` on failure? | 0.6content: p: symbols: \b\w+\b: count: 5 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 0.6 |
I suppose we can do a wildcard select first and then the later c… | 2.6content: p: symbols: \b\w+\b: count: 22 multiplier: 0.1 score: 1 pre: symbols: \b\w+\b: count: 2 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 2 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 2.6 |
I'm not sure. I'm reviewing all of the property names next to ea… | 11.3content: p: symbols: \b\w+\b: count: 23 multiplier: 0.1 score: 1 pre: symbols: \b\w+\b: count: 45 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 45 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 11.3 |
I've never seen people writing html comments like this. I don't … | 1.9content: p: symbols: \b\w+\b: count: 19 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 1.9 |
Not sure, perhaps `pattern: ` however I know I had anoth… | 2.1content: p: symbols: \b\w+\b: count: 13 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 4 multiplier: 0.1 score: 1 pre: symbols: \b\w+\b: count: 4 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 2.1 |
I didn't see this when I authored the first version | 1.1content: p: symbols: \b\w+\b: count: 11 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 1.1 |
I never would select `img` in the config because `a&… | 2content: p: symbols: \b\w+\b: count: 19 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 2 |
I don't understand your example but if there is a code block ins… | 8.5content: p: symbols: \b\w+\b: count: 82 multiplier: 0.1 score: 1 a: symbols: \b\w+\b: count: 2 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 8.5 |
So the strategy is to take the parent or the child element? I su… | 2content: p: symbols: \b\w+\b: count: 20 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 2 |
My goal is to make the config as intuitive as possible, sorry fo… | 1.5content: p: symbols: \b\w+\b: count: 15 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 1.5 |
Let's do what was defined in my example config above. https://g… | 2content: p: symbols: \b\w+\b: count: 20 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 2 |
At least in the config we renamed things accordingly. For exampl… | 8.2content: p: symbols: \b\w+\b: count: 81 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 8.2 |
The "enabled" properties should be removed | 0.6content: p: symbols: \b\w+\b: count: 6 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 0.6 |
View | Contribution | Count | Reward |
---|---|---|---|
Review | Comment | 12 | 10.725 |
Comment | Formatting | Relevance | Reward |
---|---|---|---|
why is there a need for merge? can't you just use `Value.Def… | 0.65content: p: symbols: \b\w+\b: count: 19 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 7 multiplier: 0.1 score: 1 multiplier: 0.25 | 1 | 0.65 |
```suggestion * If set to false, the plugi… | 0.9content: pre: symbols: \b\w+\b: count: 18 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 18 multiplier: 0.1 score: 1 multiplier: 0.25 | 1 | 0.9 |
won't this count twice? for example: ``` <p>… | 0.625content: p: symbols: \b\w+\b: count: 16 multiplier: 0.1 score: 1 pre: symbols: \b\w+\b: count: 8 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.1 score: 1 multiplier: 0.25 | 1 | 0.625 |
shouldn't it multiply by relevance not divide? | 0.2content: p: symbols: \b\w+\b: count: 8 multiplier: 0.1 score: 1 multiplier: 0.25 | 1 | 0.2 |
if there's no match shouldn't we set to 0? | 0.275content: p: symbols: \b\w+\b: count: 11 multiplier: 0.1 score: 1 multiplier: 0.25 | 1 | 0.275 |
In my memory score was basically a reward for each tag, for exam… | 2.275content: p: symbols: \b\w+\b: count: 91 multiplier: 0.1 score: 1 multiplier: 0.25 | 1 | 2.275 |
maybe it'd be better to name this `tags` or `tagMult… | 0.3content: p: symbols: \b\w+\b: count: 11 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.1 score: 1 multiplier: 0.25 | 1 | 0.3 |
I always associate symbols with special characters like ^%$#, so… | 0.475content: p: symbols: \b\w+\b: count: 18 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.1 score: 1 multiplier: 0.25 | 1 | 0.475 |
if I want to create a multiplier that's not specific to a partic… | 0.95content: p: symbols: \b\w+\b: count: 38 multiplier: 0.1 score: 1 multiplier: 0.25 | 1 | 0.95 |
Of course people don't write comments in HTML but when markdown … | 0.6content: p: symbols: \b\w+\b: count: 24 multiplier: 0.1 score: 1 multiplier: 0.25 | 1 | 0.6 |
can you help me grasp how this calculation work? At first for e… | 3.125content: p: symbols: \b\w+\b: count: 124 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.1 score: 1 multiplier: 0.25 | 1 | 3.125 |
that was also my understanding but currently in the code it's a … | 0.35content: p: symbols: \b\w+\b: count: 14 multiplier: 0.1 score: 1 multiplier: 0.25 | 1 | 0.35 |
View | Contribution | Count | Reward |
---|---|---|---|
Review | Comment | 2 | 4.75 |
Comment | Formatting | Relevance | Reward |
---|---|---|---|
@gentlementlegen Check [this](https://github.com/rndquu-org/test… | 3.725content: p: symbols: \b\w+\b: count: 4 multiplier: 0.1 score: 1 a: symbols: \b\w+\b: count: 1 multiplier: 0.1 score: 1 pre: symbols: \b\w+\b: count: 72 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 72 multiplier: 0.1 score: 1 multiplier: 0.25 | 1 | 3.725 |
@gentlementlegen The `Invalid incentives configuration detec… | 1.025content: p: symbols: \b\w+\b: count: 36 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 4 multiplier: 0.1 score: 1 a: symbols: \b\w+\b: count: 1 multiplier: 0.1 score: 1 multiplier: 0.25 | 1 | 1.025 |
View | Contribution | Count | Reward |
---|---|---|---|
Review | Comment | 3 | 60.1 |
Comment | Formatting | Relevance | Reward |
---|---|---|---|
@gentlementlegen where can I find a working config setup? Is the… | 12.7content: p: symbols: \b\w+\b: count: 12 multiplier: 0.1 score: 1 a: symbols: \b\w+\b: count: 1 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 4 multiplier: 0.1 score: 1 pre: symbols: \b\w+\b: count: 4 multiplier: 0.1 score: 1 h2: symbols: \b\w+\b: count: 106 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 12.7 |
- https://github.com/ubq-testing/conversation-rewards/issues/1 … | 30.8content: ul: symbols: \b\w+\b: count: 153 multiplier: 0.1 score: 1 li: symbols: \b\w+\b: count: 153 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 2 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 30.8 |
Very weird you'd expect that a fork of your branch, an exact con… | 16.6content: h2: symbols: \b\w+\b: count: 52 multiplier: 0.1 score: 1 p: symbols: \b\w+\b: count: 111 multiplier: 0.1 score: 1 a: symbols: \b\w+\b: count: 2 multiplier: 0.1 score: 1 code: symbols: \b\w+\b: count: 1 multiplier: 0.1 score: 1 multiplier: 1 | 1 | 16.6 |
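The Formatting and Reward columns in the tables above appear to follow a simple rule: for each HTML tag, multiply the match count by the symbol multiplier and the tag score, sum over all tags, apply the comment-level multiplier, and finally scale by relevance. This is inferred from the numbers shown, not taken from the plugin's source, and the type and function names below are hypothetical.

```typescript
// Per-tag statistics as they appear in the Formatting column.
interface TagStats {
  count: number;      // matches of the symbol pattern inside this tag
  multiplier: number; // value per match
  score: number;      // score assigned to the tag
}

// Sum of count * multiplier * score over all tags, times the comment-level multiplier.
function formattingScore(tags: Record<string, TagStats>, commentMultiplier: number): number {
  let sum = 0;
  for (const tag of Object.keys(tags)) {
    const t = tags[tag];
    sum += t.count * t.multiplier * t.score;
  }
  return sum * commentMultiplier;
}

// Values copied from the Specification row of the first table.
const specification = formattingScore(
  {
    h2: { count: 86, multiplier: 0.1, score: 1 },
    code: { count: 32, multiplier: 0.1, score: 5 },
    p: { count: 14, multiplier: 0.1, score: 1 },
    pre: { count: 32, multiplier: 0.1, score: 0 },
    hr: { count: 1, multiplier: 0.1, score: 0 },
    em: { count: 14, multiplier: 0.1, score: 0 },
  },
  3
);

// Values copied from the "Using ChatGpt could be..." row, with relevance 0.6.
const comment = formattingScore({ p: { count: 68, multiplier: 0.2, score: 1 } }, 1);
const commentReward = comment * 0.6;

console.log(specification.toFixed(2), comment.toFixed(2), commentReward.toFixed(2));
```

Up to floating-point rounding, this reproduces the 78, 13.6, and 8.16 shown in those two rows.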
I am not sure what the clearest way to express this is, but the idea is that we have "word count" currently. I figured a generalized way to approach this was to target symbols instead.
The way that we count the amount of words is by counting the spaces.
Extending this logic, I figured we could credit sentences by assigning value to punctuation (`!`, `,`, `.`, etc.)
We can credit paragraphs by assigning value to double line breaks (`\n\n`)
Do you see where I am going with this?
Perhaps it makes sense to use regex notation instead in the config:
I am mostly happy with this config syntax because it allows us to fairly clearly express, and finely adjust the incentives.
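To illustrate the idea of targeting arbitrary symbols, here is a sketch of how a map from regex source strings to values could be credited. It is not the config from the original post: the `SymbolValues` shape, the `creditSymbols` function, the patterns, and the values are all hypothetical.

```typescript
// Hypothetical config shape: regex source string -> value credited per match.
type SymbolValues = Record<string, number>;

const symbols: SymbolValues = {
  "\\b\\w+\\b": 0.1, // words
  "[.!?]+": 0.5,     // runs of sentence-ending punctuation, so "!!" is credited once
  "\\n\\n+": 1,      // paragraph breaks (one or more blank lines)
};

// Credit every pattern independently and add the results together.
function creditSymbols(text: string, config: SymbolValues): number {
  let total = 0;
  for (const pattern of Object.keys(config)) {
    const matches = text.match(new RegExp(pattern, "g"));
    total += (matches ? matches.length : 0) * config[pattern];
  }
  return total;
}
```

Because each pattern is credited independently, enabling words and sentences together counts the same text twice, so the values have to be chosen with that in mind.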
_Originally posted by @0x4007 in https://github.com/ubiquity/ubiquibot-config/pull/18#discussion_r1680319560_