bluesky-social / atproto

Social networking technology created by Bluesky
Other
5.77k stars 406 forks source link

Lexicon violation of the labeller policy #2569

Closed yamarten closed 2 weeks ago

yamarten commented 2 weeks ago

Describe the bug

app.bsky.labeler.getServices embeds a part of the service record directly in policies, but the responses may include fields that are not in the schema. For example, in the following link, it occurred due to a typo.

https://bsky.app/profile/sugyan.com/post/3ktx5i7y5jd2n

To Reproduce

Steps to reproduce the behavior:

  1. Create a service record with your own field.
  2. Run getService for the created record.

Expected behavior

In this case, perhaps unknown fields should be excluded from the response. (Or is it acceptable for appview to extend the schema?)

Details

Additional context

As another general solution, you can allow embedding by setting the type of policies to unknown, like app.bsky.feed.defs#postView. However, the type change after publishing breaks the rules of lexicon, so it cannot be used for getService.

bnewbold commented 2 weeks ago

Additional unexpected fields are allowed. The specs say:

Unexpected fields in data which otherwise conforms to the Lexicon should be ignored.

and, in best practices section:

Implementations which serialize and deserialize data from JSON or CBOR in to structures derived from specific Lexicons should be aware of the risk of "clobbering" unexpected fields.

At a minimum, it is important that software is resilient to new unexpected fields, both in records and endpoint responses.

yamarten commented 2 weeks ago

Thank you for your answer.

Additional unexpected fields are allowed. The specs say:

Unexpected fields in data which otherwise conforms to the Lexicon should be ignored.

and, in best practices section:

Implementations which serialize and deserialize data from JSON or CBOR in to structures derived from specific Lexicons should be aware of the risk of "clobbering" unexpected fields.

I see, it seems that I misunderstood this description as a specification only for records.

When embedding a record into responses, is there a recommended guideline on whether to define the type like this or to use the unknown type like postView? (ref. #999)

bnewbold commented 2 weeks ago

We don't have a "style guide" written up yet with advice on how to structure Lexicon schemas.

When wrapping/extending/embedding records as "views", I think today we would actually recommend doing it differently than we have in some situations like profileView in the bsky Lexicons. Today, I would strongly recommend putting the complete original records as a separate field, instead of re-defining the record plus additional fields all in the same object. The reason for this is to allow "passing through" the raw record with any "unexpected" fields in endpoint responses, exactly how it is stored in the repo. Other fields should indicate the specific AT-URI of the record, and the version (CID) of the record.