hayamiz / twittering-mode

An Emacs major mode for Twitter
http://twmode.sourceforge.net/

Tweet- and tweep-filtering functionality. #73

Open flexibeast opened 10 years ago

qdot commented 10 years ago

Just curious, does this completely remove the tweets for the list, or would there be a way to flip whether or not they're displayed?

flexibeast commented 10 years ago

@qdot:

If I understand your question correctly, then my code doesn't stop the tweets specified by the 'twittering-filter-users' and 'twittering-filter-tweets' lists from being retrieved and available to the current twittering-mode session; it simply stops them being displayed to the user. So, say one had the value of 'twittering-filter-users' set to '("example")'. Tweets from the 'example' account wouldn't be displayed to the user. But if during the current session, 'twittering-filter-users' was then set to '()', any tweets from the 'example' account that had been received during the current session could (should!) get displayed to the user after refreshing the relevant timelines.

Since by default the 'twittering-filter-users' and 'twittering-filter-tweets' variables are both set to '()', no filtering of displayed tweets is done by default, such that this code has no impact on existing users until they start setting these variables themselves.
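As a minimal sketch of the configuration described above, assuming the patch from this issue is loaded so that both variables exist, an init file might contain something like the following. The account name and the pattern are placeholders for illustration only.

```elisp
;; Hide tweets from the account "example" (placeholder account name).
(setq twittering-filter-users '("example"))

;; Hide tweets whose text matches this regexp (illustrative pattern).
(setq twittering-filter-tweets '("some phrase"))

;; Later in the same session: clear the list again. Tweets already
;; received from "example" should reappear after the relevant
;; timelines are refreshed.
(setq twittering-filter-users '())
```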

Hope that helps?

qdot commented 10 years ago

Yup, that's exactly what I was wondering about (and the answer I was hoping for). Thanks!

kuanyui commented 10 years ago

I'm curious why it doesn't seem to filter URLs. I want to filter out tweets containing URLs like 4sq.com or adf.ly, but the filter doesn't seem to be applied to them.

flexibeast commented 10 years ago

@kuanyui:

My code certainly doesn't try to treat URLs any differently to other tweet-filtering criteria; could you please post the code you're using to set the twittering-filter-tweets variable?

kuanyui commented 10 years ago

(setq twittering-filter-tweets '("4sq" "http://4sq.com/.*" "http://adf.ly/[A-z0-9]*"))

But none of them has any effect.

flexibeast commented 10 years ago

@kuanyui:

My code makes use of the 'string-match' function, so whatever patterns you provide need to work with that function. If I evaluate, say, '(string-match "http://4sq.com/.*" "http://4sq.com/test.html")' in my scratch buffer, the function returns 0; if I evaluate '(string-match "4sq" "http://4sq.com/test.html")', it returns 7. Note that 'string-match' returns the index at which the match starts, or nil when nothing matches, so 0 is a successful match at the start of the string rather than a failure. So you'll need to check that your regexps return a non-nil value when 'string-match' is applied to the text that is actually being filtered.
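For reference, here is a minimal sketch of how 'string-match' reports its results, using the test URL from this comment. The helper my-text-matches-p is hypothetical and not part of twittering-mode or of my patch; it only illustrates testing for non-nil instead of non-zero.

```elisp
;; `string-match' returns the index where the first match starts,
;; or nil when the regexp does not match at all.
(string-match "http://4sq.com/.*" "http://4sq.com/test.html") ; match at index 0
(string-match "4sq" "http://4sq.com/test.html")               ; match at index 7
(string-match "adf\\.ly" "http://4sq.com/test.html")          ; nil, no match

;; Hypothetical helper: test for non-nil, so that a match at
;; index 0 still counts as a match (0 is non-nil in Emacs Lisp).
(defun my-text-matches-p (pattern text)
  "Return t if regexp PATTERN matches anywhere in TEXT, else nil."
  (and (string-match pattern text) t))
```

Any code that decides whether to filter a tweet should compare the result against nil, since a pattern anchored at the start of the text legitimately yields 0.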

cvmat commented 10 years ago

I am sorry for the late reply to this issue. The patterns fail because the filter sees the tweet text as it is received from the Twitter server, where URLs are shortened. You can see that text by evaluating (assq 'text (twittering-find-status (twittering-get-id-at))) on a tweet.

I think that you can solve the problem by replacing (cdr (assoc 'text status)) with (twittering-make-fontified-tweet-text-with-entity status) in the function twittering-filters-apply.

I understand there is a demand for filtering tweets. I will add a function that hides tweets matching a pattern instead of discarding them, once I have dealt with the Twitter Display Requirements (https://dev.twitter.com/terms/display-requirements).

flexibeast commented 10 years ago

@kuanyui:

Thanks for that!

I've just done some investigation into this issue, and the problem seems to be that all URLs in Tweet text initially arrive as t.co links, which are subsequently expanded into the actual URL by the Twitter client. In other words, if someone writes a tweet saying "I'm at GitHub 4sq.com/xxxxx", this will actually initially arrive as something like "I'm at GitHub t.co/yyyyy". My code runs via twittering-new-tweets-hook, which gets run before t.co URLs are expanded into the actual URL for display. So your patterns will never match.
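To make the mismatch concrete, here is a small sketch using the two example strings above. The helper my-any-pattern-matches-p is hypothetical and written only for this illustration; it is not how twittering-mode or my patch is actually structured.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-any-pattern-matches-p (patterns text)
  "Return t if any regexp in PATTERNS matches TEXT, else nil."
  (let ((matched nil))
    (dolist (pat patterns matched)
      (when (string-match pat text)
        (setq matched t)))))

(let ((patterns '("4sq" "adf\\.ly"))
      ;; Text as it arrives from the server: the URL is a t.co link.
      (as-received "I'm at GitHub t.co/yyyyy")
      ;; Text as eventually displayed: the URL has been expanded.
      (as-displayed "I'm at GitHub 4sq.com/xxxxx"))
  (list (my-any-pattern-matches-p patterns as-received)    ; nil
        (my-any-pattern-matches-p patterns as-displayed))) ; t
```

The same patterns only match once they are applied to the expanded text, which is why a filter running on the raw text never fires for these URLs.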

twittering-mode provides a display-url key in the alist for each status, so it would be possible to modify my code to access this for filtering purposes. For example, one might create a twittering-filter-urls variable:

(setq twittering-filter-urls '("4sq" "http://4sq.com/.*" "http://adf.ly/[A-z0-9]*"))

and then add some code just before the line '(if (= 0 matched-tweets)':

(dolist (pat twittering-filter-urls)
  (if (string-match pat (cdr (assoc 'display-url status)))
    (setq matched-tweets (+ 1 matched-tweets))))

Would you be willing to try the above and let me know if that works for you?

kuanyui commented 10 years ago

@cvmat

> I think that you can solve the problem by replacing (cdr (assoc 'text status)) with (twittering-make-fontified-tweet-text-with-entity status) in the function twittering-filters-apply.

Yes, this works now! Thanks a lot!

@flexibeast

> ...the problem seems to be that all URLs in Tweet text initially arrive as t.co links, which are subsequently expanded into the actual URL by the Twitter client. In other words, if someone writes a tweet saying "I'm at GitHub 4sq.com/xxxxx", this will actually initially arrive as something like "I'm at GitHub t.co/yyyyy"

I also noticed this when I wrote a URL parser for twittering-myfav.el, but I don't know how to get rid of t.co. (Though t.co did make it easy to write a regexp to grab the links. XD)

> (dolist (pat twittering-filter-urls)
>   (if (string-match pat (cdr (assoc 'display-url status)))
>     (setq matched-tweets (+ 1 matched-tweets))))
>
> Would you be willing to try the above and let me know if that works for you?

@cvmat's solution works too, and I prefer having fewer variables to set, so I've gone with that. Thanks all the same!

flexibeast commented 10 years ago

@cvmat:

Thanks!

@kuanyui:

cvmat's solution is much more elegant than mine - cvmat knows the internals of twittering-mode far better than me! - so I'm completely happy with you using it. :-)

cvmat commented 10 years ago

I have uploaded a patch at https://gist.github.com/cvmat/9875148 . I cannot decide on an appropriate name for the function... In the patch, I call it "excluded pattern", but what is excluded is a tweet, not a pattern. Is the name "excluded pattern" valid in English? I think that "a pattern of excluded tweets" is long-winded. "filter" is a candidate, but I think it can be read both ways: as a pattern for the tweets being excluded, or as a pattern for the tweets that remain. This interpretation may be incorrect. Do you have any ideas?