hotsh / rstat.us

Simple microblogging network based on the ostatus protocol.
http://rstat.us/
Other
722 stars 215 forks source link

Update in my feed that I didn't make #458

Closed carols10cents closed 12 years ago

carols10cents commented 12 years ago

There is an update showing up in http://rstat.us/users/Carols10cents that I did not make (http://rstat.us/updates/4f1ada8adecb350001000a77). It appears to have come from reality@identi.ca: it's in their feed here: https://identi.ca/api/statuses/user_timeline/55937.atom?max_id=89098925 and this is the conversation view of the original status: https://identi.ca/conversation/88893235#notice-89057548.atom

I know @markzalar has seen this before too but I thought at the time that it had something to do with the messed up usernames containing dots. This seems to not be the case.

I'm trying to trace how this could have happened but I really don't know what I'm looking for. I suspect it's some sort of collision with IDs across networks or something, but I don't see any IDs in common between my Author, reality's rstat.us Author, reality's identi.ca info, our Feed ids, etc.

I have no idea how to go about trying to reproduce this.

I have no idea if this is an rstat.us bug, an ostatus gem bug, an identi.ca bug, a status.net bug, a hub bug, a salmon bug, etc etc etc.

@wilkie or @steveklabnik, do you have any ideas on where to look or what to look for?

carols10cents commented 12 years ago

Ok, here's what I've found out so far:

I started tracing what happens when we get a Salmon slap (since the tweet in question seems to have come from identi.ca)

We call populate_entries and pass the <entry> from the Salmon slap: SalmonController:114 author.feed.populate_entries [atom_entry]

populate_entries checks to see if this entry exists already: Feed:93 u = Update.first(:url => entry.url)

entry is an OStatus::Entry, and its url is: From https://github.com/hotsh/ostatus/blob/master/lib/ostatus/entry.rb#L50

def url
  if links.alternate
    links.alternate.href

From https://identi.ca/api/statuses/user_timeline/55937.atom?max_id=89098925, the XML for the RT entry has:

  <link rel="alternate" type="text/html">http://identi.ca/notice/89057569</link>

So links.alternate would get this link, but this link's href would be nil.

So that means that Feed:93 does Update.first(:url => nil).

The other thing that's troubling me is that url is not a key of Update, there's just a def url and a def url=.

So I don't think this query is even valid....?

It seems like ALL Updates have :url=> nil if I do this:

Update.count(:url => nil)
=> 36818

Update.count
=> 36818

:( :( :( :fire: :scissors:

Soooo I got what Feed:93 would get currently:

  Update.first(:url => nil)
  => #<Update
       _id: BSON::ObjectId('4ee8b2d6fb54c6000101f60d'),
       author_id: BSON::ObjectId('4e87b1a51bdece449200127a'),
       created_at: Wed, 14 Dec 2011 14:28:17 UTC +00:00,
       updated_at: Sun, 29 Jan 2012 10:41:37 UTC +00:00,
       feed_id: BSON::ObjectId('4e87b1a41bdece4492001278'),
       mention_ids: [],
       remote_url: "http://identi.ca/notice/86890968",
       tags: ["twitter"],
       text: "RT @schestowitz \"Private\" messages are not really private, not even to a foreign country http://ur1.ca/7xmmt #twitter"
      >

Notice how the created_at is Dec 14 and the updated_at is Jan 29. That seems suspicious.

I looked up Author with BSON::ObjectId('4e87b1a51bdece449200127a'):

username: fontana remote_url: http://identi.ca/user/40758

And I don't see any update in that feed matching the RT text from jan 29.

If querying on url worked, since url is defined as feed.local? ? "/updates/#{id}" : remote_url, I would expect to get back the Update I was just looking at if I use its remote_url:

Update.first(:url => "http://identi.ca/notice/86890968")
=> nil

Sooooo that sure looks like that query isn't working the way we'd expect.

Knowing the above, I looked at the Update that's showing up in my stream:

  #<Update
    _id: BSON::ObjectId('4f1ada8adecb350001000a77'),
    author_id: BSON::ObjectId('4d8b420fc26df84a8900061e'),
    created_at: Sat, 21 Jan 2012 15:32:26 UTC +00:00,
    updated_at: Mon, 23 Jan 2012 03:03:22 UTC +00:00,
    feed_id: BSON::ObjectId('4d8b4210c26df84a89000620'),
    mention_ids: [BSON::ObjectId('4e7910591bdece0cef0071ba')],
    referral_id: "",
    tags: [],
    text: "RT @psquid I think I’ll move to using char[][], just to annoy @speeddefrost.",
    twitter: false
  >

A created_at that's 2 days before the updated_at. Which would also explain why in rstatus this shows up as being before the original status on identica, since we're showing the created_at (which was a different status when it was created!!!)

Questions:

  1. Do any OStatus microblogging services allow editing of statuses? That's the use case I could see for looking up an existing Update and modifying the content. If none allow editing, and if there are no other use cases for what Feed:93 is doing, I propose we change that to not do anything if the Update already exists.
  2. Original (aka not RT/<activity:verb>share</activity:verb>) statuses from identi.ca have:

    <link rel="alternate" type="text/html" href="http://identi.ca/notice/89057404"/>

    So I suspect the "real" bug is that identi.ca shares have invalid <link>s (http://www.atomenabled.org/developers/syndication/atom-format-spec.php#element.link says atom:link elements MUST have an href attribute)

    I will look for a bug like this in identi.ca/status.net's issue tracker and file one if it doesn't exist.

  3. But in the meantime, to be able to handle this, should the fix to look for the text of the <link> if it doesn't have an @href go in:
    • rstat.us
    • the ostatus gem
    • the ratom gem
    • all of the above
    • none of the above
  4. It looks to me like Update.first(:url => whatever) is an invalid query to be doing. Does anyone agree or disagree? Am I just misunderstanding how rails+mongomapper works?
  5. Or am I barking up the wrong tree on all of this????
carols10cents commented 12 years ago

Filed a bug with status.net

carols10cents commented 12 years ago

According to http://activitystrea.ms/head/activity-schema.html#verbs there is an "update" verb:

The "update" verb indicates that the actor has modified the object. Implementors SHOULD use verbs such as post where the actor is adding new items to a collection or similar. Update is reserved for modifications to existing objects or data such as changing a user's profile information.

So to me this sounds like only when we get an activity:verb update should we be modifying the text of an Update that exists already.

Valarissa commented 12 years ago

Maybe I should have kept some of the posts I put here :S My train of thought was along the same line, that a status that was posted previously, by you, was being updated, but because I didn't have a way to check the db where this had taken place, I had no way to verify a theory or if I was being a crackpot. This is why I mentioned the differing update posting times...

I think, as an extra precaution, we should probably look to verifying the author against the found status' author when performing an update. (i.e. this particular scenario would have checked to see if a username of reality would have matched the username of carols10cents... which it wouldn't have and should probably throw an access denied/403 error.

Just my thoughts.

wilkie commented 12 years ago

Interesting.

From https://identi.ca/api/statuses/user_timeline/55937.atom?max_id=89098925, the XML for the RT entry has:

<link rel="alternate" type="text/html">http://identi.ca/notice/89057569</link>

So links.alternate would get this link, but this link's href would be nil.

GRRRRR. Eff that. Alright. Consumers of information should be lenient, right? That's an OStatus gem bug. https://github.com/hotsh/ostatus/issues/4

  1. Do any OStatus microblogging services allow editing of statuses? That's the use case I could see for looking up an existing Update and modifying the content. If none allow editing, and if there are no other use cases for what Feed:93 is doing, I propose we change that to not do anything if the Update already exists.

No. OStatus does, though. So we could expect seeing that behavior from a hypothetical node. We can verify the author of a status and all updates are universally identified by URI, so it's not difficult. We just don't handle it well and may just disallow it locally.

2/3. [...] So I suspect the "real" bug is that identi.ca shares have invalid s (http://www.atomenabled.org/developers/syndication/atom-format-spec.php#element.link says atom:link elements MUST have an href attribute) [...]

True. You filed a bug on identi.ca for invalid generation... I filed a bug on ostatus gem for not being lenient.

  1. It looks to me like Update.first(:url => whatever) is an invalid query to be doing. Does anyone agree or disagree? Am I just misunderstanding how rails+mongomapper works?

Yeah. That is an invalid query. The url field was destroyed. The bigger issue is the lack of tests for those requests. That is the thing I will work toward this week. I have some code somewhere...

  1. Or am I barking up the wrong tree on all of this????

Nope. Correct tree. Woof.

wilkie commented 12 years ago

@Valarissa writes:

I think, as an extra precaution, we should probably look to verifying the author against the found status' author when performing an update. (i.e. this particular scenario would have checked to see if a username of reality would have matched the username of carols10cents... which it wouldn't have and should probably throw an access denied/403 error.

We verify authorship of salmon notifications, currently, because they are signed. OStatus does not have good validation. Spoof a hub or use a malicious hub, and you can author updates as another person. There is validation of the hub sending the request... so it should be ok. As such, OStatus is a little broken in this regard...

As for performing the check just because... well... no! Don't do that! :) If you expect that an update not overwrite another one in a particular case, don't write code to catch that case in the implementation... write a test. For the record, you compare author URIs against one another, not usernames. ;)