buckket / twtxt

Decentralised, minimalist microblogging service for hackers.
http://twtxt.readthedocs.org/en/stable/
MIT License

Evil escape sequences #60

kseistrup closed this issue 8 years ago

kseistrup commented 8 years ago

Just a warning for those of you who are writing twtxt terminal clients: Please visit https://mosh.mit.edu/ and search for “Careful terminal emulation”. While we still ought to allow unicode, we should probably think about sanitizing each tweet before displaying it.

~@kas

kseistrup commented 8 years ago

A quick and dirty solution could be to ignore tweets that don't pass the .isprintable() test. TABs and the like are not deemed printable, but we could easily convert all whitespace to proper space chars (and collapse multiple whitespaces at the same time), e.g.:

def collapse(text):
    """Collapse multiple whitespaces and test for printability"""
    collapsed = ' '.join(text.split())
    if not collapsed.isprintable():
        return None
    return collapsed

The caller should, of course, check whether the return value is None.
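A usage sketch for the collapse() function above, showing the None check on the caller's side. The sample tweets and the displayable() helper are made up for illustration and are not part of twtxt.

```python
def collapse(text):
    """Collapse multiple whitespaces and test for printability"""
    collapsed = ' '.join(text.split())
    if not collapsed.isprintable():
        return None
    return collapsed


# Hypothetical sample tweets, invented for this example.
tweets = [
    "hello\tworld  again",        # TAB and double space get collapsed
    "snowman \u2603 is fine",     # printable non-ASCII text passes
    "\x1b[2Jclear your screen",   # contains ESC, so the tweet is rejected
]


def displayable(raw_tweets):
    """Return only the tweets that survive collapse() (hypothetical helper)."""
    result = []
    for raw in raw_tweets:
        clean = collapse(raw)
        if clean is None:  # unprintable content: skip the whole tweet
            continue
        result.append(clean)
    return result


# The third tweet is dropped; the first two come back with single spaces.
shown = displayable(tweets)
```

Note that this drops an offending tweet entirely rather than trying to repair it, which matches the "ignore tweets" idea above.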

buckket commented 8 years ago

My hacky solution was to call click.unstyle() when parsing new Tweets, which then removes all escape sequences. Not sure if this is sufficient, though.

kseistrup commented 8 years ago

@buckket thanks for reminding me of click.unstyle().

kseistrup commented 8 years ago

I have added an @evil stream at /evil.txt on the server where my default stream can be found. The first five lines read:

2016-02-11T13:33:59+0000    This is a TWTXT file, please see <https://github.com/buckket/twtxt> for details.
2016-02-11T13:36:48+0000    WARNING: This stream may contain overly long lines, evil escape sequences, binary fluff, and other non-standard content.
2016-02-11T13:39:49+0000    The file is NOT intended to do harm, nor intended for public consumption.
2016-02-11T13:40:51+0000    Rather, it could be used by developers to test their TWTXT clients against a potentially malformed file.
2016-02-11T13:43:02+0000    *** PLEASE PROCEED AT YOUR OWN PERIL *** YOU HAVE BEEN WARNED ***

I do not want people to stumble over this file accidentally and think I'm doing this with malicious intent, so I'm not posting the direct link. However, you should be able to find it with minimal effort, especially if you are already following me.

buckket commented 8 years ago

@kseistrup Thanks, will test against later. :)

erlehmann commented 8 years ago

@kseistrup I could use that file too. Where is it?

kseistrup commented 8 years ago

@erlehmann same server as @kas' twtxt stream, with filename /evil.txt instead of /twtxt.txt.

buckket commented 8 years ago

Guess this still needs some work, as click.unstyle() does not solve all of it. Will try the isprintable-approach.
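A minimal sketch of why stripping styling alone may fall short, and how it could be combined with the isprintable approach. The CSI_RE pattern, strip_csi() and sanitize() are hypothetical names for illustration; the pattern only stands in for "remove colour-style escape sequences" and is not click's implementation.

```python
import re

# Hypothetical pattern: matches only CSI sequences (ESC [ ... final letter),
# such as colour codes. Other escape sequence families are not covered.
CSI_RE = re.compile(r"\x1b\[[0-9;?]*[A-Za-z]")


def strip_csi(text):
    """Remove CSI sequences and leave everything else untouched."""
    return CSI_RE.sub("", text)


def sanitize(text):
    """Strip CSI sequences, collapse whitespace, reject unprintable leftovers."""
    collapsed = " ".join(strip_csi(text).split())
    return collapsed if collapsed.isprintable() else None


coloured = "\x1b[31mred\x1b[0m text"     # plain colour codes
title_change = "\x1b]0;pwned\x07hello"   # OSC sequence (ESC ] ... BEL)

# Colour codes are removed cleanly.
assert strip_csi(coloured) == "red text"

# The OSC sequence is not a CSI sequence, so it passes through unchanged...
assert strip_csi(title_change) == title_change

# ...but the printability test still catches the leftover ESC and BEL.
assert sanitize(coloured) == "red text"
assert sanitize(title_change) is None
```

With this ordering, harmlessly coloured tweets are still shown without their styling, while anything that keeps control characters after stripping is ignored.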