Closed RydalWater closed 1 month ago
Is this similar to https://github.com/nostr-protocol/nips/pull/879?
Yeah, definitely some substantial overlap. I was thinking about the problem more from the client side: making it easy to find and maintain reviews about given topics, and leveraging existing tags to provide a variety of data types when providing the review (so they can be attached to those objects more directly).
The proposal for NIP-85 looks like it wants to give more graduated, structured reviews (customer service, cleanliness, etc.), but this very general model makes it difficult to standardize a review across many clients, as the number of rating values may not be consistent with a simple overall rating (as I've proposed here). I am not in any way saying one method is better, though, because a review with a breakdown of ratings can also be extremely valuable.
@staab realizing there is crossover, I'd like to get your thoughts on this proposal. I know your goal with the solution you provided was simplicity, but I wonder if it is maybe too simple for some cases. Or perhaps both proposals have value?
`type` is the same as the `l` tag, except it's numeric, which makes it harder to read and the NIP harder to maintain. Nested namespaces are only useful if the sub-namespace is user-generated. So we should either add a new top-level `kind` for each review type, or use arbitrary `l` tags.
`unit` and `denom` unnecessarily partition review data sets. In other words, how do you average a rating of 3/5 stars and 2/10 chef's-kiss-hands? You would normalize them to percents, but then some semantics have been lost. It's probably best to just normalize to a decimal/percent like #879 does, and clients can show whatever granularity or emojis they want to.
`32020` would fit neatly into NIP 51, I have no problem with these events.
Just updated #879 to what my preference would be for reviews. I don't mention the list or additional tags per review type, but those could be added.
Also, you can ignore my comment about backwards compatibility, I don't think it's actually important. There are kind 1985 and 1986 reviews with different formats, both published by coracle, and a few kind 1986 published under a `qts` namespace, looks like maybe for a freelancing product. I only found 86 events of either kind, so we should just pick a new kind and move on.
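The normalization argument above can be sketched in a few lines. This is only an illustration: it assumes each rating arrives as a (value, scale maximum) pair, and the function names are hypothetical, not part of #879 or any NIP.

```python
from fractions import Fraction

def normalize(value: int, maximum: int) -> Fraction:
    """Map a rating on a 0..maximum scale to a fraction between 0 and 1."""
    if maximum <= 0 or not 0 <= value <= maximum:
        raise ValueError("rating must lie between 0 and a positive maximum")
    return Fraction(value, maximum)

def average(ratings: list[tuple[int, int]]) -> Fraction:
    """Average ratings given on different scales after normalizing each one."""
    if not ratings:
        raise ValueError("at least one rating is required")
    normalized = [normalize(value, maximum) for value, maximum in ratings]
    return sum(normalized) / len(normalized)

# 3/5 stars and 2/10 on another scale, as in the example above
mixed = [(3, 5), (2, 10)]
print(float(average(mixed)))  # 0.4
```

Once both ratings are fractions they average cleanly, but nothing in the result records that one came from a five-point scale and the other from a ten-point scale, which is the loss of semantics mentioned above.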
I see now, yes, you're right. I was trying to use the `type` tag as a way to provide a pointer for the client as to what the subsequent tag will look like (though, as you say, that basically replicates the `kind` behavior and is likely not the cleanest way to handle it). The general idea here was to attach the review to real things (not just nostr events); as in your example, we want to be able to connect a review to a company, product, URL, podcast, movie, venue, etc. In your proposal would we just put all of these related items into an `i` tag?
For general reviews I think there are a few problem statements we need to resolve:
Your solution I think solves the first, though I think it could still be extended to cover the others without much of an issue. The list part of this proposal is independent but helps solve the second problem, which is extremely valuable as it can help clients avoid presenting outdated reviews without needing to go looking for this information (it puts the user in control of what they are putting their name next to). For problem 3 there is almost no reason not to extend the `rating` kind to include the denominator, since we'd be adding a tag anyway and the additional tag can be optional.
I've added a comment/suggested update to #879 and I'll work on a PR for the list proposal, changing the kind to 31987 for clear linkage with your proposed rating kind.
I think having a review kind is a great addition to nostr. I like the simpler version from @staab in https://github.com/nostr-protocol/nips/pull/879 but also think a list of reviews is a great addition.
> In your proposal would we just put all of these related items into an `i` tag?

Yep, `i` is the right way to handle things external to nostr. I think for reviews it would be worthwhile to use `i` even to review things within nostr, e.g. `e:<event-id>` or `p:<pubkey>`, to avoid collisions on regular mentions/parents/quotes.
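A minimal sketch of that convention, assuming the target is written as `<prefix>:<identifier>` inside an `i` tag. The `e:` and `p:` prefixes come from the comment above; the `isbn` prefix and both function names are assumptions made only for illustration.

```python
def review_target_tag(prefix: str, identifier: str) -> list[str]:
    """Build an `i` tag whose value names the thing under review."""
    if not prefix or not identifier:
        raise ValueError("both a prefix and an identifier are required")
    return ["i", f"{prefix}:{identifier}"]

def parse_review_target(tag: list[str]) -> tuple[str, str]:
    """Split an `i` tag value back into (prefix, identifier)."""
    if len(tag) < 2 or tag[0] != "i":
        raise ValueError("not an i tag")
    prefix, separator, identifier = tag[1].partition(":")
    if not separator or not prefix or not identifier:
        raise ValueError("expected a value of the form prefix:identifier")
    return prefix, identifier

# A nostr event, a pubkey, and an external item (hypothetical identifiers)
print(review_target_tag("e", "<event-id>"))
print(review_target_tag("p", "<pubkey>"))
print(parse_review_target(["i", "isbn:9780140328721"]))
```

Because the target lives in `i` rather than in `e` or `p`, a client that queries for replies or mentions by `e`/`p` tag would not pick up reviews by accident.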
> I must have a way to change or replace a review in case my opinion of an object changes

We could use replaceables instead, why not. The `d` tag could then be the target of the review, clearing up my previous comment.
> I should have a way to normalize the rating data across multiple platforms (therefore the maximum value for a rating scale SHOULD/COULD be provided to facilitate this).

Using a value from zero to one would do this, with unlimited granularity, right?
I've gone and updated #879 to include a NIP 51 list, and use replaceable events.
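To illustrate what replaceable reviews mean for a client: for parameterized replaceable events, only the newest event per (pubkey, kind, `d` tag value) is treated as current. The sketch below assumes events are plain dictionaries with `pubkey`, `kind`, `created_at` and `tags` fields; `EXAMPLE_KIND` is a placeholder rather than an assigned kind, and tie-breaking between equal timestamps is left out.

```python
EXAMPLE_KIND = 39999  # placeholder kind in the replaceable range, not an assigned value

def d_tag(event: dict) -> str:
    """Return the value of the first `d` tag, or an empty string if absent."""
    return next((t[1] for t in event["tags"] if len(t) > 1 and t[0] == "d"), "")

def latest_reviews(events: list[dict]) -> dict[tuple[str, int, str], dict]:
    """Keep only the newest event per (pubkey, kind, d tag value)."""
    latest: dict[tuple[str, int, str], dict] = {}
    for event in events:
        key = (event["pubkey"], event["kind"], d_tag(event))
        current = latest.get(key)
        if current is None or event["created_at"] > current["created_at"]:
            latest[key] = event
    return latest

events = [
    {"pubkey": "alice", "kind": EXAMPLE_KIND, "created_at": 100,
     "tags": [["d", "isbn:123"], ["rating", "0.9"]]},
    {"pubkey": "alice", "kind": EXAMPLE_KIND, "created_at": 200,
     "tags": [["d", "isbn:123"], ["rating", "0.4"]]},
]
current = latest_reviews(events)
print(current[("alice", EXAMPLE_KIND, "isbn:123")]["tags"][1])  # ['rating', '0.4']
```

With the review target as the `d` value, publishing a new review of the same item replaces the old one without any separate bookkeeping.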
Great, I think it is coming together nicely.
One thought is, why restrict kind `31987` to relays specifically? Shouldn't we just go with the following structure:
Review event structure:
- `d` (can be anything effectively, used for replacing/coordination)
- `i` (object under review)
- `rating` 1-n
- `denominator`
**I realise I keep coming back to this but it is valuable I think to keep away from the protocol requiring that all clients conform to a 0-100 scale for ratings. I agree you can get back to 0-100 from rating/denom but you shouldn't have to. You should be able to capture the value as it is entered not as the protocol wants to see it. It makes the protocol too opinionated I think about what a rating looks like.
> One thought is, why restrict kind `31987` to relays specifically?

Additional tags might make sense in some circumstances and not others. We could potentially use a single kind, and show a different UI based on the `d` tag. But in other NIPs (like #1043) it has seemed to make more sense to use a different kind for each different thing.

> You should be able to capture the value as it is entered not as the protocol wants to see it.

I'm unconvinced, but it would be fine to add a `denominator` (or maybe `scale`) tag to your client if you prefer. If enough people adopt it, we can add it to the NIP. I understand what you're going for, I just think it would break interoperability if clients had to use the original denominator when displaying reviews. In reality, clients will just normalize all reviews, which is harder when using different denominators.
Maybe a different way to solve this would be something like a `user_rating` tag, which would contain the rating "text", for example some emojis or the text "3/5" or something. That way clients can normalize, but always show the original intent.
> Additional tags might make sense in some circumstances and not others. We could potentially use a single kind, and show a different UI based on the `d` tag. But in other NIPs (like #1043) it has seemed to make more sense to use a different kind for each different thing.

Fair enough, it just feels redundant to start with multiple kinds when a general kind may cover a broad range of use cases; if specialized use cases emerge, those can then deviate from the general kind to meet bespoke needs. I am not against specific kinds, I just think it is more work to create new kinds for each unique use case, right?
> I'm unconvinced, but it would be fine to add a `denominator` (or maybe `scale`) tag to your client if you prefer. If enough people adopt it, we can add it to the NIP. I understand what you're going for, I just think it would break interoperability if clients had to use the original denominator when displaying reviews. In reality, clients will just normalize all reviews, which is harder when using different denominators.
>
> Maybe a different way to solve this would be something like a `user_rating` tag, which would contain the rating "text", for example some emojis or the text "3/5" or something. That way clients can normalize, but always show the original intent.

Happy to concede this ground, I am not so tied to it. I do like your suggestion of adding an optional general text field which could be used to convey ratings verbatim. Perhaps `raw_rating` instead of user. A client could look for it or ignore it as they see fit.
Finally, I was thinking more broadly about this problem and whether even "ratings" is too specific. I wondered if perhaps this could be more simply defined as a "Score Kind". That is to say, the kind is used to convey a score for something. Ratings, for example, are a type of score. But for questionnaires or other similar things where you ask a user to grade something, you wouldn't call them ratings per se.
I realize this last thought really looks back at the concept of ratings and leans into my suggestion above of a single general purpose kind rather than specific kinds for specific use cases. What do you think?
Example:
{
  "kind": 2020,
  "tags": [
    ["d", "<some unique id>"],
    ["i", "<external ID for the item being scored>"],
    ["score", "0.8"],
    ["raw_score", "8/10"]
    //.. repeat score/raw_score
  ],
  "content": "<Some optional comments about the score provided>"
}
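As a rough sketch of how a client might read such an event, the snippet below pairs each `score` tag with the `raw_score` tag that follows it, if any. The pairing-by-order rule and the function name are assumptions; the example above does not say how repeated pairs should be matched.

```python
from typing import Optional

def read_scores(event: dict) -> list[tuple[float, Optional[str]]]:
    """Return (normalized score, original text or None) pairs in tag order."""
    pairs: list[list] = []
    for tag in event.get("tags", []):
        if len(tag) < 2:
            continue
        name, value = tag[0], tag[1]
        if name == "score":
            score = float(value)
            if not 0.0 <= score <= 1.0:
                raise ValueError("score must be between 0 and 1")
            pairs.append([score, None])
        elif name == "raw_score" and pairs and pairs[-1][1] is None:
            # Attach the verbatim text to the most recent score without one
            pairs[-1][1] = value
    return [(score, raw) for score, raw in pairs]

event = {
    "kind": 2020,
    "tags": [
        ["d", "<some unique id>"],
        ["i", "<external ID for the item being scored>"],
        ["score", "0.8"],
        ["raw_score", "8/10"],
    ],
    "content": "<Some optional comments about the score provided>",
}
print(read_scores(event))  # [(0.8, '8/10')]
```

A client that does not understand `raw_score` can ignore it and still aggregate on `score`, which is the point of keeping the verbatim text optional.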
> I wondered if perhaps this could be more simply defined as a "Score Kind".

See here for a draft of a more generic version of ratings I did a while back. This approach has mostly been rejected by the community. In nostr, more concrete is generally better.

> Perhaps `raw_rating` instead of user.

I'll leave this out for now since I'm not sure how such a thing would best be designed, but feel free to add such a tag and we can spec it then.
> > I wondered if perhaps this could be more simply defined as a "Score Kind".
>
> See here for a draft of a more generic version of ratings I did a while back. This approach has mostly been rejected by the community. In nostr, more concrete is generally better.

Thanks for this context, it helps a lot.
Just to make sure I am clear on the next steps (trying to avoid overlap):
[…] `i` tag and them to be replaceable `d`, though these could be one and the same), I guess I'll make a separate proposal for another NIP which has a specific use-case (and kind) for the ratings I need, along with an appropriate update to NIP51 to include a Ratings set?
> the current proposed NIP51 kind is generic

Now, this I think is probably ok. But I don't know what use cases you have in mind.
> I guess I'll make a separate proposal for another NIP which has a specific use-case (and kind) for the ratings I need, along with an appropriate update to NIP51 to include a Ratings set?

Sure, nothing wrong with a competing PR. I'm not really working on this right now, so for the foreseeable future coracle will be on the old reviews draft spec. My suggestion would be to go ahead and build your client, and publish your NIP afterwards, now that we've built some consensus on what reviews should look like. NIPs are meaningless except as a nexus for discussion until they've been implemented.
> > the current proposed NIP51 kind is generic
>
> Now, this I think is probably ok. But I don't know what use cases you have in mind.

I'm looking to create lists of reviews for books (hence the urge to include the `i` tag). I was hoping to get this added before my first deployment, but I've pulled the trigger and will add the reviews in the next release (https://github.com/RydalWater/OpenLibrarian). This is currently up in test mode, so everything is cached locally and no events are published to relays.
The concern I had with your NIP51 proposal was that it currently ties the list type specifically to NIP85 events (which are specific). That said, if you think it is just an update to include other NIPs, that is fine. We could always update it to something like `A list of rating events (e.g., [NIP 85](./85.md))`, which would mean we don't need to immediately list other rating types.
> Sure, nothing wrong with a competing PR. I'm not really working on this right now, so for the foreseeable future coracle will be on the old reviews draft spec. My suggestion would be to go ahead and build your client, and publish your NIP afterwards, now that we've built some consensus on what reviews should look like. NIPs are meaningless except as a nexus for discussion until they've been implemented.

Really do appreciate the back and forth on this discussion, it has been extremely useful. I'll mull over my implementation for the next couple of weeks and then keep you posted on how it goes.
Will go ahead and close this thread later today.
About:
The ability to review products, services and/or other consumables is a critical way by which companies and product developers receive feedback informing them of their success and/or failure. While social signals (e.g., follows, likes, comments, etc.) do provide some real-time feedback, these are inherently ephemeral in nature and do not always reflect the formal experience of an individual with regard to a given product.
Existing product review processes focus on the aggregation of data in the hope that bad actors will not have an outsized impact on the collective review scoring for a product. They rely on metrics such as total number of reviews, average ratings and ratios (good vs. bad). Nostr provides a unique method of leveraging social graphs to surface spheres of influence relevant to users, which allows for a fine-tuned user experience when judging the appropriateness of a product for their needs.
While it is true that with the current Nostr tools we can already achieve some targeted opinions by surfacing social signals, this isn't perfect, as it doesn't express an explicit quantifiable opinion across users. One user's like may be another's love; instead we should really be using more formal grading scales to explicitly state the reviewer's opinion (in a way that can be aggregated easily), and we should also ideally have a method by which opinions can be changed or replaced over time. Maybe a company or product starts off good but then takes a nosedive; users of these products should reserve the right to change their opinion to reflect their current experience.
Proposal:
This proposal leverages a bunch of existing building blocks, including a range of tags as specified in other NIPs, along with the introduction of two new kinds: `2020` for reviews themselves and `32020` for parameterized replaceable events which would represent user/client defined review sets.
Structure:
- `type`: helps categorize the content and provides a pointer for clients as to which identifying tag will follow (see below)
- `denom`: the denominator for the rating
- `rating`: the rating value (numerator)
- `unit`: optional tag to allow clients to determine the rating unit
- `e` tags refer back to valid/current review events (maintainable like follow/mute lists).
Example Details:
Type-Tag Mapping:
This table describes a range of type values which would inform the valid/expected tag, helping the client to identify the item under review.
- `e` or `a`
- `p`
- `i`
- `relay`
- `location` or `g`
- `web`
Possible future tag ideas:
Comments:
The proposal allows individual review events to exist and be created on the fly, and lets the user maintain their own list of valid events, thereby giving them the ability to explicitly state that a review still reflects their current opinion on the subject. The downside of this method is that clients will need to retrieve all valid reviews and parse them in order to determine which may be relevant to display.
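The client-side work described above might look roughly like the following. It assumes events are plain dictionaries, that the review set is a kind `32020` event whose `e` tags hold the ids of still-valid kind `2020` reviews, and that a list only vouches for reviews by the same author; the function names are hypothetical.

```python
REVIEW_KIND = 2020        # review events, as proposed here
REVIEW_SET_KIND = 32020   # user-maintained review set, as proposed here

def endorsed_review_ids(review_set: dict) -> set[str]:
    """Collect the event ids referenced by `e` tags in a review set."""
    return {t[1] for t in review_set["tags"] if len(t) > 1 and t[0] == "e"}

def current_reviews(review_set: dict, reviews: list[dict]) -> list[dict]:
    """Keep reviews the list author still stands behind."""
    endorsed = endorsed_review_ids(review_set)
    return [
        review for review in reviews
        if review["kind"] == REVIEW_KIND
        and review["pubkey"] == review_set["pubkey"]
        and review["id"] in endorsed
    ]

review_set = {"kind": REVIEW_SET_KIND, "pubkey": "alice",
              "tags": [["d", "books"], ["e", "r2"]]}
reviews = [
    {"id": "r1", "kind": REVIEW_KIND, "pubkey": "alice", "tags": []},  # outdated
    {"id": "r2", "kind": REVIEW_KIND, "pubkey": "alice", "tags": []},  # still valid
    {"id": "r3", "kind": REVIEW_KIND, "pubkey": "bob", "tags": []},    # other author
]
print([r["id"] for r in current_reviews(review_set, reviews)])  # ['r2']
```

The cost noted above shows up here as two fetches: the client needs the review set first and only then knows which review events are worth requesting and displaying.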
Comments/feedback etc. very welcome. I've probably not done things the most 'nostr' way so very open to suggestions here.
The suggested new tags would also be useful outside of this specific case; for example, `unit` would be handy for step tracking and other health app clients.