emilyhunt / bluesky-astronomy-feeds

Astronomy-themed feeds for Bluesky.
MIT License

Irrelevant words and phrases can cause posts to be added to feeds erroneously #6

Closed emilyhunt closed 1 year ago

emilyhunt commented 1 year ago

This is gonna be hell to fix without removing support for words adding posts to feeds automatically, but maybe some common ones could be solved. I'll keep a list here of all terms that have caused problems...

Astronomy

(List of words that can be mis-detected)

emilyhunt commented 1 year ago

Another option could be to require multiple relevant terms in one post - maybe even multiple different ones? The relevant terms list could be expanded. This would need testing to make sure relevant posts are still largely kept but that irrelevant ones are actually removed.
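A minimal sketch of the "multiple different terms" idea, purely for illustration. The term list, the function names and the threshold of 2 are all hypothetical and not taken from the repo; the real list and matching logic may look quite different.

```python
import re

# Hypothetical term list, for illustration only (not the feed's real list).
RELEVANT_TERMS = {"astronomy", "telescope", "galaxy", "exoplanet", "nebula"}


def count_distinct_terms(text, terms=RELEVANT_TERMS):
    """Count how many *different* relevant terms appear as whole words."""
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    return len(words & set(terms))


def is_relevant(text, min_terms=2, terms=RELEVANT_TERMS):
    """Keep a post only if it mentions at least `min_terms` distinct terms."""
    return count_distinct_terms(text, terms) >= min_terms


if __name__ == "__main__":
    print(is_relevant("New telescope image of a spiral galaxy!"))
    print(is_relevant("The astronomy of my regrets is vast"))
```

Repeating the same word does not help a post pass, since only distinct terms are counted. Whether a threshold of 2 keeps enough genuinely relevant posts is exactly the thing that would need testing.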

emilyhunt commented 1 year ago

Could use ML to classify posts. Might be a pain to get it to work nicely though, especially if more feeds are added.

emilyhunt commented 1 year ago

Alternatively, I could look into ChatGPT API pricing and just use GPT to classify posts... that would be a wild ride.

The average skeet (~200 characters) might be around 50 tokens. (Output cost would be almost irrelevant, since the output would be just "yes" or "no".) It may be possible to use an older model (like GPT-3.5) and still get very good results at a much lower cost.

It would cost about $1.50 per 1000 skeets to use GPT-4 w/ 8k context, or about 1/20th of that to use GPT-3.5 ($0.075 / 1000). It would also be such a cool little thing to do!
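The arithmetic behind those numbers can be checked with a quick sketch. The per-1K-token input prices below ($0.03 and $0.0015) are not quoted anywhere in this thread; they are simply the rates implied by the figures above at ~50 tokens per skeet, so treat them as assumptions. Output tokens and any instruction prompt sent with each skeet are ignored here, so a real bill would likely be higher.

```python
def cost_for_posts(n_posts, tokens_per_post, usd_per_1k_input_tokens):
    """Rough input-token cost of classifying `n_posts` posts, in USD."""
    total_tokens = n_posts * tokens_per_post
    return total_tokens / 1000 * usd_per_1k_input_tokens


# Assumed prices, implied by the figures in the comment above.
GPT4_8K_USD_PER_1K = 0.03
GPT35_USD_PER_1K = 0.0015

if __name__ == "__main__":
    gpt4 = cost_for_posts(1000, 50, GPT4_8K_USD_PER_1K)
    gpt35 = cost_for_posts(1000, 50, GPT35_USD_PER_1K)
    print(f"GPT-4 (8k): ${gpt4:.3f} per 1000 skeets")
    print(f"GPT-3.5:    ${gpt35:.3f} per 1000 skeets")
    print(f"ratio:      {gpt4 / gpt35:.0f}x")
```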

emilyhunt commented 1 year ago

Closing for now as trigger words were removed from the astronomy feed and won't be re-added for a while/at all.