ubiquity-os / plugins-wishlist

`issue_comment.created` vector embeddings #32

Closed 0x4007 closed 3 weeks ago

0x4007 commented 1 month ago

This task serves as the initial setup, but once the repository is made we can migrate those old issues. The old issues will probably also need to be updated for more modern technologies. Generally, the same approach will still be used.

sshivaditya2019 commented 4 weeks ago

/start

ubiquity-os[bot] commented 4 weeks ago

! Please set your wallet address with the /wallet command first and try again.

sshivaditya2019 commented 4 weeks ago

/wallet 0xDAba6e01D15Db560b88C8F426b016801f79e1F69

ubiquity-os[bot] commented 4 weeks ago

+ Successfully registered wallet address

sshivaditya2019 commented 4 weeks ago

/start

ubiquity-os[bot] commented 4 weeks ago

Deadline Tue, Aug 27, 6:40 PM UTC
Registered Wallet 0xDAba6e01D15Db560b88C8F426b016801f79e1F69
Tips:
<ul>
<li>Use <code>/wallet 0x0000...0000</code> if you want to update your registered payment wallet address.</li>
<li>Be sure to open a draft pull request as soon as possible to communicate updates on your progress.</li>
<li>Be sure to provide timely updates to us when requested, or you will be automatically unassigned from the task.</li>
</ul>

0x4007 commented 4 weeks ago

@sshivaditya2019 Supabase has first-class support for vector embeddings, which might be easier for you to get started with.

Cloudflare might also have a solution, which might be preferable because we generally run these plugins as Cloudflare Workers.

sshivaditya2019 commented 4 weeks ago

@0x4007 Could you provide additional details on this issue?

So, all the existing comments will be stored in the vector store, and when comments are edited they have to be updated in the vector store. Is that right?
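A minimal sketch of the behavior being asked about, assuming created and edited comments both end up as an upsert keyed by comment id. The names `CommentEvent`, `EmbeddingStore`, and `fakeEmbed` are hypothetical and not part of the plugin; the embedding function is a deterministic placeholder standing in for a real model call, and the store is an in-memory map rather than a real vector database.

```typescript
// Hypothetical types for illustration; not the plugin's actual API.
type CommentEvent = {
  action: "created" | "edited";
  comment: { id: number; body: string };
};

type StoredComment = { id: number; body: string; embedding: number[] };

// Placeholder embedding (assumption): a real plugin would call an embedding model here.
function fakeEmbed(text: string, dims = 4): number[] {
  const vec = new Array<number>(dims).fill(0);
  for (let i = 0; i < text.length; i++) {
    vec[i % dims] = (vec[i % dims] ?? 0) + text.charCodeAt(i);
  }
  return vec;
}

// In-memory stand-in for the vector store, keyed by comment id.
class EmbeddingStore {
  private rows = new Map<number, StoredComment>();

  // Both "created" and "edited" overwrite the row for that comment id,
  // so an edit replaces the old embedding instead of adding a second row.
  handle(event: CommentEvent): void {
    const { id, body } = event.comment;
    this.rows.set(id, { id, body, embedding: fakeEmbed(body) });
  }

  get(id: number): StoredComment | undefined {
    return this.rows.get(id);
  }

  get size(): number {
    return this.rows.size;
  }
}
```

Treating both actions as the same upsert keeps the handler idempotent: replaying an event leaves the store unchanged.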

0x4007 commented 4 weeks ago

I would make the row ID the comment id. You can also store the issue specification by storing the issue id (not the issue number within the repo, but the id, which is a large number).

0x4007 commented 4 weeks ago

@Keyrxng do you have any resources for how @sshivaditya2019 can get started on building a plugin?

sshivaditya2019 commented 4 weeks ago

So, the issues (along with the spec) and the comments would be stored in a DB (separate tables), but we need embeddings only for the comments. Is that right?

Keyrxng commented 4 weeks ago

1. https://github.com/ubiquibot/plugin-template - It contains a small hello-world example and the readme has various links and tips.
2. Use the most recent plugins for reference, such as: command-start-stop, automated-merging, command-wallet
3. If you are not already familiar with the kernel and the prerequisites of plugin building, head here

@0x4007 Maybe it's time for an early version of official docs, covering things like secret passing, a description of the main components, best practices, etc.? There are the videos that gitcoindev made; I will find those and link them.

Also could you create a repo for this plugin so that @sshivaditya2019 can fork the repo for an easy PR?

0x4007 commented 4 weeks ago

> So, the issues (along with the spec) and the comments would be stored in a DB (separate tables), but we need embeddings only for the comments. Is that right?

Just issue body and comments. Same table. IDs shouldn't collide.
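One way to picture the single-table layout described here, as a sketch under stated assumptions: the row shape, the `kind` discriminator, and the helper names are all hypothetical and are not the plugin's actual schema. The idea is that the issue body and each comment become rows in the same table, keyed by their GitHub ids (not the per-repo issue number), with a guard in case two ids ever do collide.

```typescript
// Hypothetical row shape; column names are assumptions, not the real schema.
type EmbeddingRow = {
  id: number; // GitHub id of the issue or comment (not the per-repo issue number)
  kind: "issue" | "comment"; // illustrative discriminator, not mentioned in the thread
  issueId: number; // id of the issue this text belongs to
  plaintext: string;
};

type Issue = { id: number; number: number; body: string | null };
type IssueComment = { id: number; body: string };

function issueToRow(issue: Issue): EmbeddingRow {
  return { id: issue.id, kind: "issue", issueId: issue.id, plaintext: issue.body ?? "" };
}

function commentToRow(issue: Issue, comment: IssueComment): EmbeddingRow {
  return { id: comment.id, kind: "comment", issueId: issue.id, plaintext: comment.body };
}

// Builds one table holding the issue body and its comments together,
// failing loudly if two rows ever share an id.
function buildTable(issue: Issue, comments: IssueComment[]): Map<number, EmbeddingRow> {
  const table = new Map<number, EmbeddingRow>();
  const rows = [issueToRow(issue), ...comments.map((c) => commentToRow(issue, c))];
  for (const row of rows) {
    if (table.has(row.id)) {
      throw new Error(`id collision on ${row.id}`);
    }
    table.set(row.id, row);
  }
  return table;
}
```

Storing `issueId` on every row means the issue body and all of its comments can be fetched together with a single filter.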

0x4007 commented 4 weeks ago

> Also could you create a repo for this plugin so that @sshivaditya2019 can fork the repo for an easy PR?

I'm not on my computer so they can fork from you for now

Keyrxng commented 4 weeks ago

@sshivaditya2019 you can fork this repo

@0x4007 I have a poor track record with naming plugins I think lmao it probs could have a better name

ubiquity-os[bot] commented 3 weeks ago

[ 600 WXDAI ]

@sshivaditya2019
Contributions Overview

| View | Contribution | Count | Reward |
| --- | --- | --- | --- |
| Issue | Task | 1 | 600 |
| Issue | Comment | 2 | 0 |
| Review | Comment | 26 | 0 |

Conversation Incentives
Comment Formatting Relevance Reward
@0x4007 Could you provide additional details on this issue? So,…
0
content:
  p:
    symbols:
      \b\w+\b:
        count: 36
        multiplier: 0
    score: 1
multiplier: 0
0.8 -
So, the issues (along with spec), and the comments would be stor…
0
content:
  p:
    symbols:
      \b\w+\b:
        count: 28
        multiplier: 0
    score: 1
multiplier: 0
0.7 -
Resolves [#32](https://github.com/ubiquibot/plugins-wishlist/iss…
0
content:
  p:
    symbols:
      \b\w+\b:
        count: 2
        multiplier: 0
    score: 1
  a:
    symbols:
      \b\w+\b:
        count: 1
        multiplier: 0
    score: 1
  ul:
    symbols:
      \b\w+\b:
        count: 95
        multiplier: 0
    score: 1
  li:
    symbols:
      \b\w+\b:
        count: 31
        multiplier: 0
    score: 1
  code:
    symbols:
      \b\w+\b:
        count: 10
        multiplier: 0
    score: 1
  pre:
    symbols:
      \b\w+\b:
        count: 10
        multiplier: 0
    score: 0
multiplier: 0
1 -
We are validating the schema before running the plugin. I think …
0
content:
  p:
    symbols:
      \b\w+\b:
        count: 15
        multiplier: 0.2
    score: 1
multiplier: 0
1 -
<img width="862" alt="image" src="https://github.com/user-att…
0
content:
  p:
    symbols:
      \b\w+\b:
        count: 38
        multiplier: 0.2
    score: 1
multiplier: 0
1 -
I am trying to replicate this, but I am not able to launch wrang…
0
content:
  p:
    symbols:
      \b\w+\b:
        count: 27
        multiplier: 0.2
    score: 1
multiplier: 0
1 -
Could be expanded to add support, for text generation and questi…
0
content:
  p:
    symbols:
      \b\w+\b:
        count: 22
        multiplier: 0.2
    score: 1
multiplier: 0
1 -
- Vector search with large model could be slow with large number…
0
content:
  ul:
    symbols:
      \b\w+\b:
        count: 90
        multiplier: 0.2
    score: 1
  li:
    symbols:
      \b\w+\b:
        count: 46
        multiplier: 0.2
    score: 1
multiplier: 0
1 -
No, I don't have anything running on port 4000. Possible bug i…
0
content:
  p:
    symbols:
      \b\w+\b:
        count: 17
        multiplier: 0.2
    score: 1
  a:
    symbols:
      \b\w+\b:
        count: 2
        multiplier: 0.2
    score: 1
multiplier: 0
1 -
Just to be clear, this is a design issue as well. Depending on t…
0
content:
  p:
    symbols:
      \b\w+\b:
        count: 92
        multiplier: 0.2
    score: 1
multiplier: 0
1 -
Fixed in the latest commit, uses large model.
0
content:
  p:
    symbols:
      \b\w+\b:
        count: 8
        multiplier: 0.2
    score: 1
multiplier: 0
1 -
Isn't that the expected behavior ? In case of collision it shoul…
0
content:
  p:
    symbols:
      \b\w+\b:
        count: 25
        multiplier: 0.2
    score: 1
multiplier: 0
1 -
While its running, the `issue_comment.created` will crea…
0
content:
  p:
    symbols:
      \b\w+\b:
        count: 44
        multiplier: 0.2
    score: 1
  code:
    symbols:
      \b\w+\b:
        count: 2
        multiplier: 0.2
    score: 1
multiplier: 0
1 -
This should be fixed in the latest commit. I have added it in th…
0
content:
  p:
    symbols:
      \b\w+\b:
        count: 15
        multiplier: 0.2
    score: 1
multiplier: 0
1 -
@gentlementlegen How about `Worker plugin for generating vec…
0
content:
  p:
    symbols:
      \b\w+\b:
        count: 18
        multiplier: 0.2
    score: 1
  code:
    symbols:
      \b\w+\b:
        count: 15
        multiplier: 0.2
    score: 1
multiplier: 0
1 -
It was there for `createContext`. There was comment that…
0
content:
  p:
    symbols:
      \b\w+\b:
        count: 28
        multiplier: 0.2
    score: 1
  code:
    symbols:
      \b\w+\b:
        count: 14
        multiplier: 0.2
    score: 1
multiplier: 0
1 -
Removed that. I was under the impression kernel expects that val…
0
content:
  p:
    symbols:
      \b\w+\b:
        count: 11
        multiplier: 0.2
    score: 1
multiplier: 0
1 -
But, as mentioned before, I expect this to run in a Cloudflare w…
0
content:
  p:
    symbols:
      \b\w+\b:
        count: 22
        multiplier: 0.2
    score: 1
multiplier: 0
1 -
@0x4007
0
content:
  p:
    symbols:
      \b\w+\b:
        count: 1
        multiplier: 0.2
    score: 1
multiplier: 0
1 -
Fixed added schema for table. Unit tests work now. Schema typing…
0
content:
  p:
    symbols:
      \b\w+\b:
        count: 27
        multiplier: 0.2
    score: 1
multiplier: 0
1 -
Added Github Workflow, and fixed the package.json.
0
content:
  p:
    symbols:
      \b\w+\b:
        count: 8
        multiplier: 0.2
    score: 1
multiplier: 0
1 -
Knip should be fixed in the latest commit.
0
content:
  p:
    symbols:
      \b\w+\b:
        count: 8
        multiplier: 0.2
    score: 1
multiplier: 0
1 -
- I expected this to run on worker, since action takes too long …
0
content:
  ul:
    symbols:
      \b\w+\b:
        count: 44
        multiplier: 0.2
    score: 1
  li:
    symbols:
      \b\w+\b:
        count: 3
        multiplier: 0.2
    score: 1
  code:
    symbols:
      \b\w+\b:
        count: 2
        multiplier: 0.2
    score: 1
multiplier: 0
1 -
Could you please confirm if the ` .ubiquibot-config.yml`…
0
content:
  p:
    symbols:
      \b\w+\b:
        count: 89
        multiplier: 0.2
    score: 1
  code:
    symbols:
      \b\w+\b:
        count: 3
        multiplier: 0.2
    score: 1
multiplier: 0
1 -
https://github.com/user-attachments/assets/1af6065a-8ed0-4a61-a0…
0
content:
  p:
    symbols:
      \b\w+\b:
        count: 11
        multiplier: 0.2
    score: 1
multiplier: 0
1 -
- If there are multiple comments simultaneously they might creat…
0
content:
  ul:
    symbols:
      \b\w+\b:
        count: 70
        multiplier: 0.2
    score: 1
  li:
    symbols:
      \b\w+\b:
        count: 18
        multiplier: 0.2
    score: 1
multiplier: 0
1 -
- I think there should be a third option apart from workers and …
0
content:
  ul:
    symbols:
      \b\w+\b:
        count: 48
        multiplier: 0.2
    score: 1
  li:
    symbols:
      \b\w+\b:
        count: 24
        multiplier: 0.2
    score: 1
multiplier: 0
1 -
@gentlementlegen Removed `.ubiquibot-config.yml`. Change…
0
content:
  p:
    symbols:
      \b\w+\b:
        count: 10
        multiplier: 0.2
    score: 1
  code:
    symbols:
      \b\w+\b:
        count: 2
        multiplier: 0.2
    score: 1
multiplier: 0
1 -

[ 76.88 WXDAI ]

@0x4007
Contributions Overview

| View | Contribution | Count | Reward |
| --- | --- | --- | --- |
| Issue | Specification | 1 | 18.9 |
| Issue | Comment | 5 | 12.38 |
| Review | Comment | 26 | 45.6 |

Conversation Incentives
Comment Formatting Relevance Reward
- This allows the bot to learn and understand from all of the co…
18.9
content:
  ul:
    symbols:
      \b\w+\b:
        count: 100
        multiplier: 0.1
    score: 0
  li:
    symbols:
      \b\w+\b:
        count: 53
        multiplier: 0.1
    score: 1
  code:
    symbols:
      \b\w+\b:
        count: 2
        multiplier: 0.1
    score: 5
multiplier: 3
1 18.9
@sshivaditya2019 supabase has first class support for vector emb…
7.6
content:
  p:
    symbols:
      \b\w+\b:
        count: 38
        multiplier: 0.2
    score: 1
multiplier: 1
0.5 3.8
I would make the row ID the comment id. You can also store the i…
6.8
content:
  p:
    symbols:
      \b\w+\b:
        count: 34
        multiplier: 0.2
    score: 1
multiplier: 1
0.9 6.12
@Keyrxng do you have any resources for how @sshivaditya2019 can…
3.2
content:
  p:
    symbols:
      \b\w+\b:
        count: 16
        multiplier: 0.2
    score: 1
multiplier: 1
0.2 0.64
Just issue body and comments. Same table. IDs shouldn't collide.
2.2
content:
  p:
    symbols:
      \b\w+\b:
        count: 11
        multiplier: 0.2
    score: 1
multiplier: 1
0.7 1.54
I'm not on my computer so they can fork from you for now
2.8
content:
  p:
    symbols:
      \b\w+\b:
        count: 14
        multiplier: 0.2
    score: 1
multiplier: 1
0.1 0.28
This pull looks great. I just have some cosmetic changes
1
content:
  p:
    symbols:
      \b\w+\b:
        count: 10
        multiplier: 0.1
    score: 1
multiplier: 1
1 1
Delete
0.1
content:
  p:
    symbols:
      \b\w+\b:
        count: 1
        multiplier: 0.1
    score: 1
multiplier: 1
1 0.1
Why?
0.1
content:
  p:
    symbols:
      \b\w+\b:
        count: 1
        multiplier: 0.1
    score: 1
multiplier: 1
1 0.1
Why?
0.1
content:
  p:
    symbols:
      \b\w+\b:
        count: 1
        multiplier: 0.1
    score: 1
multiplier: 1
1 0.1
Is this the best model?
0.5
content:
  p:
    symbols:
      \b\w+\b:
        count: 5
        multiplier: 0.1
    score: 1
multiplier: 1
1 0.5
Maybe remove before we merge ```suggestion ` …
0.6
content:
  p:
    symbols:
      \b\w+\b:
        count: 5
        multiplier: 0.1
    score: 1
  pre:
    symbols:
      \b\w+\b:
        count: 1
        multiplier: 0.1
    score: 0
  code:
    symbols:
      \b\w+\b:
        count: 1
        multiplier: 0.1
    score: 1
multiplier: 1
1 0.6
Prefer to check and throw error if key is empty first.
1.1
content:
  p:
    symbols:
      \b\w+\b:
        count: 11
        multiplier: 0.1
    score: 1
multiplier: 1
1 1.1
Oh perhaps this throws the error if it's empty
1
content:
  p:
    symbols:
      \b\w+\b:
        count: 10
        multiplier: 0.1
    score: 1
multiplier: 1
1 1
@gentlementlegen rfc
0.2
content:
  p:
    symbols:
      \b\w+\b:
        count: 2
        multiplier: 0.1
    score: 1
multiplier: 1
1 0.2
Why not the large model, cost aside?
0.7
content:
  p:
    symbols:
      \b\w+\b:
        count: 7
        multiplier: 0.1
    score: 1
multiplier: 1
1 0.7
@gentlementlegen rfc
0.2
content:
  p:
    symbols:
      \b\w+\b:
        count: 2
        multiplier: 0.1
    score: 1
multiplier: 1
1 0.2
I think we should optimize as needed instead of proactively. Thi…
9.4
content:
  p:
    symbols:
      \b\w+\b:
        count: 94
        multiplier: 0.1
    score: 1
multiplier: 1
1 9.4
I think we should prefix all with `uos-` because ubiquib…
1.3
content:
  p:
    symbols:
      \b\w+\b:
        count: 12
        multiplier: 0.1
    score: 1
  code:
    symbols:
      \b\w+\b:
        count: 1
        multiplier: 0.1
    score: 1
multiplier: 1
1 1.3
I wonder if it makes sense to also rename the repositories to ma…
1.3
content:
  p:
    symbols:
      \b\w+\b:
        count: 13
        multiplier: 0.1
    score: 1
multiplier: 1
1 1.3
Generates vector embeddings of GitHub comments and stores them i…
1.1
content:
  p:
    symbols:
      \b\w+\b:
        count: 11
        multiplier: 0.1
    score: 1
multiplier: 1
1 1.1
```suggestion name: "@ubiquity-os/comment-vector-emb…
14.8
content:
  pre:
    symbols:
      \b\w+\b:
        count: 6
        multiplier: 0.1
    score: 0
  code:
    symbols:
      \b\w+\b:
        count: 6
        multiplier: 0.1
    score: 1
  ul:
    symbols:
      \b\w+\b:
        count: 76
        multiplier: 0.1
    score: 1
  li:
    symbols:
      \b\w+\b:
        count: 66
        multiplier: 0.1
    score: 1
multiplier: 1
1 14.8
Can't we do `"issue_comment"` to be concise? @gentlement…
1
content:
  p:
    symbols:
      \b\w+\b:
        count: 9
        multiplier: 0.1
    score: 1
  code:
    symbols:
      \b\w+\b:
        count: 1
        multiplier: 0.1
    score: 1
multiplier: 1
1 1
ubiquity-os
0.2
content:
  p:
    symbols:
      \b\w+\b:
        count: 2
        multiplier: 0.1
    score: 1
multiplier: 1
1 0.2
Possibly `issue_comment` only
0.4
content:
  p:
    symbols:
      \b\w+\b:
        count: 3
        multiplier: 0.1
    score: 1
  code:
    symbols:
      \b\w+\b:
        count: 1
        multiplier: 0.1
    score: 1
multiplier: 1
1 0.4
I think running as worker is fine.
0.7
content:
  p:
    symbols:
      \b\w+\b:
        count: 7
        multiplier: 0.1
    score: 1
multiplier: 1
1 0.7
UbiquiBot term is deprecated in favor of UbiquityOS (or when low…
1.3
content:
  p:
    symbols:
      \b\w+\b:
        count: 13
        multiplier: 0.1
    score: 1
multiplier: 1
1 1.3
Fix `Resolves #32` by placing the full URL of the issue …
1.8
content:
  p:
    symbols:
      \b\w+\b:
        count: 16
        multiplier: 0.1
    score: 1
  code:
    symbols:
      \b\w+\b:
        count: 2
        multiplier: 0.1
    score: 1
multiplier: 1
1 1.8
Sure new database is fine to keep things simple.
0.9
content:
  p:
    symbols:
      \b\w+\b:
        count: 9
        multiplier: 0.1
    score: 1
multiplier: 1
1 0.9
Very lovely QA testing video thank you.
0.7
content:
  p:
    symbols:
      \b\w+\b:
        count: 7
        multiplier: 0.1
    score: 1
multiplier: 1
1 0.7
GitHub actions are totally free for us. Workers can be costly. …
2.5
content:
  p:
    symbols:
      \b\w+\b:
        count: 25
        multiplier: 0.1
    score: 1
multiplier: 1
1 2.5
I think create and we can transfer later? I don't have a lot of …
2.6
content:
  p:
    symbols:
      \b\w+\b:
        count: 26
        multiplier: 0.1
    score: 1
multiplier: 1
1 2.6

[ 6.6 WXDAI ]

@Keyrxng
Contributions Overview

| View | Contribution | Count | Reward |
| --- | --- | --- | --- |
| Issue | Comment | 2 | 6.6 |

Conversation Incentives
Comment Formatting Relevance Reward
1. https://github.com/ubiquibot/plugin-template - It contains a …
8
content:
  ol:
    symbols:
      \b\w+\b:
        count: 54
        multiplier: 0.1
    score: 0
  li:
    symbols:
      \b\w+\b:
        count: 17
        multiplier: 0.1
    score: 1
  a:
    symbols:
      \b\w+\b:
        count: 1
        multiplier: 0.1
    score: 1
  code:
    symbols:
      \b\w+\b:
        count: 2
        multiplier: 0.1
    score: 1
  hr:
    symbols:
      \b\w+\b:
        count: 1
        multiplier: 0.1
    score: 0
  p:
    symbols:
      \b\w+\b:
        count: 60
        multiplier: 0.1
    score: 1
multiplier: 1
0.6 4.8
@sshivaditya2019 you can fork this repo - https://github.com/ub…
6
content:
  p:
    symbols:
      \b\w+\b:
        count: 6
        multiplier: 0.1
    score: 1
  ul:
    symbols:
      \b\w+\b:
        count: 27
        multiplier: 0.1
    score: 1
  li:
    symbols:
      \b\w+\b:
        count: 27
        multiplier: 0.1
    score: 1
multiplier: 1
0.3 1.8

[ 94.8 WXDAI ]

@gentlementlegen
Contributions Overview

| View | Contribution | Count | Reward |
| --- | --- | --- | --- |
| Review | Comment | 29 | 94.8 |

Conversation Incentives
Comment Formatting Relevance Reward
I had a horrendous time trying this locally and had to rely on a…
12.7
content:
  p:
    symbols:
      \b\w+\b:
        count: 25
        multiplier: 0.1
    score: 1
  ul:
    symbols:
      \b\w+\b:
        count: 75
        multiplier: 0.1
    score: 1
  li:
    symbols:
      \b\w+\b:
        count: 25
        multiplier: 0.1
    score: 1
  code:
    symbols:
      \b\w+\b:
        count: 2
        multiplier: 0.1
    score: 1
multiplier: 1
1 12.7
Some more changes might be needed to be able to run this one, bu…
6.4
content:
  h2:
    symbols:
      \b\w+\b:
        count: 21
        multiplier: 0.1
    score: 1
  p:
    symbols:
      \b\w+\b:
        count: 43
        multiplier: 0.1
    score: 1
multiplier: 1
1 6.4
Good with me. @0x4007 if you can create the Supabase instance an…
3
content:
  p:
    symbols:
      \b\w+\b:
        count: 30
        multiplier: 0.1
    score: 1
multiplier: 1
1 3
You can also cast the object to the proper type, I believe it gi…
1.7
content:
  p:
    symbols:
      \b\w+\b:
        count: 17
        multiplier: 0.1
    score: 1
multiplier: 1
1 1.7
```suggestion "name": "@ubiquibot/issue-comment-e…
0.5
content:
  pre:
    symbols:
      \b\w+\b:
        count: 5
        multiplier: 0.1
    score: 0
  code:
    symbols:
      \b\w+\b:
        count: 5
        multiplier: 0.1
    score: 1
multiplier: 1
1 0.5
Don't you have anything running already on port 4000? What error…
1.5
content:
  p:
    symbols:
      \b\w+\b:
        count: 15
        multiplier: 0.1
    score: 1
multiplier: 1
1 1.5
```suggestion name = "ubiquibot-issue-comment-embed…
0.5
content:
  pre:
    symbols:
      \b\w+\b:
        count: 5
        multiplier: 0.1
    score: 0
  code:
    symbols:
      \b\w+\b:
        count: 5
        multiplier: 0.1
    score: 1
multiplier: 1
1 0.5
I guess this can be part of a batch rename of all the projects.
1.4
content:
  p:
    symbols:
      \b\w+\b:
        count: 14
        multiplier: 0.1
    score: 1
multiplier: 1
1 1.4
We could consider having this parameter in the configuration lat…
1.8
content:
  p:
    symbols:
      \b\w+\b:
        count: 18
        multiplier: 0.1
    score: 1
multiplier: 1
1 1.8
I guess this would be reset every run. If we run twice the logic…
2.4
content:
  p:
    symbols:
      \b\w+\b:
        count: 24
        multiplier: 0.1
    score: 1
multiplier: 1
1 2.4
During the run it is fine, but all the results are pushed to sup…
2.7
content:
  p:
    symbols:
      \b\w+\b:
        count: 27
        multiplier: 0.1
    score: 1
multiplier: 1
1 2.7
Since this is auto-generated I think you'd be better setting the…
2
content:
  p:
    symbols:
      \b\w+\b:
        count: 20
        multiplier: 0.1
    score: 1
multiplier: 1
1 2
```suggestion "supabase": "1.191.3", ``…
0.4
content:
  pre:
    symbols:
      \b\w+\b:
        count: 4
        multiplier: 0.1
    score: 0
  code:
    symbols:
      \b\w+\b:
        count: 4
        multiplier: 0.1
    score: 1
multiplier: 1
1 0.4
This should be removed, otherwise it will appear in the `/he…
1.3
content:
  p:
    symbols:
      \b\w+\b:
        count: 12
        multiplier: 0.1
    score: 1
  code:
    symbols:
      \b\w+\b:
        count: 1
        multiplier: 0.1
    score: 1
multiplier: 1
1 1.3
Please change `name` and `description` properly.
0.7
content:
  p:
    symbols:
      \b\w+\b:
        count: 6
        multiplier: 0.1
    score: 1
  code:
    symbols:
      \b\w+\b:
        count: 1
        multiplier: 0.1
    score: 1
multiplier: 1
1 0.7
Should not be included.
0.4
content:
  p:
    symbols:
      \b\w+\b:
        count: 4
        multiplier: 0.1
    score: 1
multiplier: 1
1 0.4
Should not be deleted.
0.4
content:
  p:
    symbols:
      \b\w+\b:
        count: 4
        multiplier: 0.1
    score: 1
multiplier: 1
1 0.4
@0x4007 Probably a description like `Created vector embeddin…
2.6
content:
  p:
    symbols:
      \b\w+\b:
        count: 18
        multiplier: 0.1
    score: 1
  code:
    symbols:
      \b\w+\b:
        count: 8
        multiplier: 0.1
    score: 1
multiplier: 1
1 2.6
Why do you set this variable?
0.6
content:
  p:
    symbols:
      \b\w+\b:
        count: 6
        multiplier: 0.1
    score: 1
multiplier: 1
1 0.6
What is this for?
0.4
content:
  p:
    symbols:
      \b\w+\b:
        count: 4
        multiplier: 0.1
    score: 1
multiplier: 1
1 0.4
The kernel uses it to track the status of the plugins, but withi…
1.9
content:
  p:
    symbols:
      \b\w+\b:
        count: 19
        multiplier: 0.1
    score: 1
multiplier: 1
1 1.9
This will always crash on run: https://github.com/Meniole/issue…
7.1
content:
  p:
    symbols:
      \b\w+\b:
        count: 69
        multiplier: 0.1
    score: 1
  code:
    symbols:
      \b\w+\b:
        count: 2
        multiplier: 0.1
    score: 1
multiplier: 1
1 7.1
Not with the current implementation because it wouldn't match th…
1.7
content:
  p:
    symbols:
      \b\w+\b:
        count: 17
        multiplier: 0.1
    score: 1
multiplier: 1
1 1.7
This file should be removed, otherwise no plugin will run inside…
1.4
content:
  p:
    symbols:
      \b\w+\b:
        count: 14
        multiplier: 0.1
    score: 1
multiplier: 1
1 1.4
@sshivaditya2019 It seems that you are storing the vectors withi…
3.6
content:
  p:
    symbols:
      \b\w+\b:
        count: 36
        multiplier: 0.1
    score: 1
multiplier: 1
1 3.6
@sshivaditya2019 Thanks a lot for the changes, appreciated. Sinc…
19.8
content:
  p:
    symbols:
      \b\w+\b:
        count: 54
        multiplier: 0.1
    score: 1
  ul:
    symbols:
      \b\w+\b:
        count: 72
        multiplier: 0.1
    score: 1
  li:
    symbols:
      \b\w+\b:
        count: 70
        multiplier: 0.1
    score: 1
  code:
    symbols:
      \b\w+\b:
        count: 2
        multiplier: 0.1
    score: 1
multiplier: 1
1 19.8
Please fix the Knip issues, I will try to set up this on my repo…
2
content:
  p:
    symbols:
      \b\w+\b:
        count: 20
        multiplier: 0.1
    score: 1
multiplier: 1
1 2
@0x4007 My concern about having this as a worker is the followin…
9.2
content:
  p:
    symbols:
      \b\w+\b:
        count: 12
        multiplier: 0.1
    score: 1
  ul:
    symbols:
      \b\w+\b:
        count: 61
        multiplier: 0.1
    score: 1
  li:
    symbols:
      \b\w+\b:
        count: 19
        multiplier: 0.1
    score: 1
multiplier: 1
1 9.2
Tested this on my Cloudflare, works fine, thanks for all the bug…
4.7
content:
  p:
    symbols:
      \b\w+\b:
        count: 44
        multiplier: 0.1
    score: 1
  code:
    symbols:
      \b\w+\b:
        count: 3
        multiplier: 0.1
    score: 1
multiplier: 1
1 4.7

0x4007 commented 3 weeks ago

@Keyrxng I think this was eligible for the additional DoraHacks incentive. Would you be able to handle this?

Keyrxng commented 3 weeks ago

@sshivaditya2019 If you can enter a submission on DoraHacks with a link to this issue and your GitHub username, I can process it.

0x4007 commented 3 weeks ago

I just realized the original plan was to use Cloudflare's free embeddings. We can make this cost optimization later, but @sshivaditya2019, is there any specific reason why you must use OpenAI embeddings?

sshivaditya2019 commented 3 weeks ago

> I just realized the original plan was to use Cloudflare's free embeddings. We can make this cost optimization later, but @sshivaditya2019, is there any specific reason why you must use OpenAI embeddings?

OpenAI embeddings are better for designing future AI applications using the data. As far as I know, Cloudflare embeddings are restricted to the paid plan.

Let me know if this is a blocker. I can try to work on an alternative.

sshivaditya2019 commented 3 weeks ago

@Keyrxng I have created the submission on DoraHacks.

0x4007 commented 3 weeks ago

> OpenAI embeddings are better for designing future AI applications using the data.

Why?

sshivaditya2019 commented 3 weeks ago

> > OpenAI embeddings are better for designing future AI applications using the data.
>
> Why?

First, leaderboard-wise, text-embedding-3-large is better than bge-large-en-v1.5. Second, in my experience, context retrieval is better with the OpenAI model.

Keyrxng commented 3 weeks ago

@sshivaditya2019

https://polygonscan.com/tx/0x93d290e768a53f4a14bac7ccf171dba61b7a2081d69a702213b665613abc9bfb

gentlementlegen commented 3 weeks ago

I created the DB and updated the secrets accordingly. However, the RLS on the DB has not been set up inside the migrations; I opened a ticket for that.

0x4007 commented 3 weeks ago

> > > OpenAI embeddings are better for designing future AI applications using the data.
> >
> > Why?
>
> First, leaderboard-wise, text-embedding-3-large is better than bge-large-en-v1.5. Second, in my experience, context retrieval is better with the OpenAI model.

bge-large-en-v1.5 is rank 30.

text-embedding-3-large is rank 309

We should be using the highest quality and cheapest models.

sshivaditya2019 commented 3 weeks ago

> > > > OpenAI embeddings are better for designing future AI applications using the data.
> > >
> > > Why?
> >
> > First, leaderboard-wise, text-embedding-3-large is better than bge-large-en-v1.5. Second, in my experience, context retrieval is better with the OpenAI model.
>
> bge-large-en-v1.5 is rank 30.
>
> text-embedding-3-large is rank 309
>
> We should be using the highest quality and cheapest models.

The leaderboard does not have all components filled in. text-embedding-3-large has an MTEB average of 64.6 and bge-large-en-v1.5 has 64.23. If you sort by retrieval score, text-embedding-3-large should be better.

Also, the Cloudflare Workers AI platform is not GA yet.

Keyrxng commented 3 weeks ago

@gentlementlegen I've hijacked the deployment temporarily

0x4007 commented 2 weeks ago

I filed a new issue because I realized that you didn't save the issue bodies as we agreed.

Please prioritize handling it @sshivaditya2019

sshivaditya2019 commented 2 weeks ago

> I filed a new issue because I realized that you didn't save the issue bodies as we agreed.
>
> Please prioritize handling it @sshivaditya2019

As far as I know, issue bodies are stored as per the schema. The issue text is stored in the 'issuebody' column.

0x4007 commented 2 weeks ago

@Keyrxng set up their own database and ran this but couldn't find any issues saved

sshivaditya2019 commented 2 weeks ago

> @Keyrxng set up their own database and ran this but couldn't find any issues saved

In #8, the image in the issue spec shows the issue bodies. Just to be clear, by issue body you are referring to the issue's text body, right?

0x4007 commented 2 weeks ago

Sorry I meant that they need the ID of the issue and the issue body alone. You focused the schema around the comments and their IDs.