miguelfreitas / twister-core

twister core / daemon
MIT License

raise char limit to 256 #76

Closed gubatron closed 8 years ago

gubatron commented 10 years ago

Twitter's 140 char limitation came from their wish to interoperate with SMS.

As this is a p2p network, and SMS has come a long way since Twitter was born (everybody now has smartphones, etc.), can we raise this annoying limit to 256 chars so that we can convey more interesting thoughts and quotes (without leaving the 'microblogging' category)?

BlockTester commented 10 years ago

That might not be convenient for interoperability with the third world... something to think about.

gubatron commented 10 years ago

how can you interface now via SMS with twister? through a centralized component? ;)

BlockTester commented 10 years ago

Someone could write a bridge to other protocols. I could see someone wanting to plug it into a self-hosted VoIP system with gateways to, yes, centralized systems. I'm really playing devil's advocate, because I too think 140 chars is a bit... restrictive.

miguelfreitas commented 10 years ago

Folks, I have no strong opinion on this. But there are several things I think we should consider before changing this.

iShift commented 10 years ago

I think we need to add new fields to a tweet: 1) URL 2) name for the URL 3) MAGNET 4) MAGNET options

By magnet I mean that the user makes a torrent of the image (or other content) which he wants to post, then starts seeding that torrent to the swarm.

In the magnet options we need to state what is in the magnet: text, image, or something else. By default we need to support only image and text torrents.

And we need to set a size limit for MAGNET image torrents; I think 5 MB is a good size for photos.

By text torrents I mean a "read more" button, for when a user wants to post a large amount of information.

I think it would be an UBER function for twister =)

And in the end we have 140 chars + content (image/read more/other).

What "other" is, we can decide later. By default, everything that is not image or text = other, and is NOT automatically downloaded by clients.

BlockTester commented 10 years ago

+1 @iShift I was thinking exactly the same thing. It's definitely an added value to be explored!

BlockTester commented 10 years ago

If we're going to add fields, maybe it could be useful to add a language field?

iShift commented 10 years ago

I think we need to rename this issue to "Add new fields", or close it.

kseistrup commented 10 years ago

I'm all for 256 chars. 140 chars seems arcane and is too restrictive. We needn't settle for the lowest common denominator.

abiliojr commented 10 years ago

I believe dumbphones are great, even though I haven't seen one in a long time (and I do live in a third-world country, btw). If the 140 limit is an issue (and I believe it is more of an issue for software that would let you, let's say, share on both Twitter and Twister), I say break messages down when funneling the text through Twitter (or SMS).

For SMS, I wouldn't mind, as this exists and I believe is a de facto standard: http://en.wikipedia.org/wiki/Concatenated_SMS

Anyway, I believe that we should look to the future, not to the past. And yeah, for the future, there is still the issue of the Twitter limit. That could be solved in client software, either by splitting the message (at spaces or punctuation marks) or by using one of the mechanisms already in place for people who need to express more.

I vote for a limit close to 250 characters, and a separate field to include pointers to other things (like images and video on a Magnet link, for example), but I would allow more than one, to be able to include several images and then address them separately.

Another way I see to settle the 140 vs. a lot question (I wouldn't mind reading a 500-character message) is to keep the 140 limit, then add a pointer structure within the extra fields so longer messages can be automatically broken up, distributed, and then glued back together.

Having this, I guess we can offer a new reward for helping with the mining process. You can call it "long messages", and I guess regular Joes and Janes (and not companies) would love to share a bit of their CPU time in order to get a pack of 10 long messages (I'm just using my imagination).

Btw, when we talk about 140 characters, are those utf-8 or utf-16 characters?

Sorry for the long post, see, I wouldn't have made it on 140 characters anyway.

kseistrup commented 10 years ago

It seems ‘characters’ is ‘displayed characters’ and not ‘bytes/octets in message’.

gubatron commented 10 years ago

+1 to abiliojr's auto split idea for sms client delivery. I vote for the message size to be 256. if we can't have 256, then 160 is also SMS friendly (newer SMS generation) and those 20 characters do count when trying to say something in microblogging format.

iShift commented 10 years ago

Main aim: a link shortener and new fields for images and magnet URLs. Next: message size. As I see it, 140 is good; 160 + new fields is OK.

thedod commented 10 years ago

The need to shorten URLs keeps coming up in discussions. The way people usually implement shortening is via a central provider (and that sucks).

We can simply have "shortener posts" or "naming posts" which, unlike "text posts" (the only kind we have so far), are used to name a URL (could be a magnet link as well). We could differentiate between post types OOB (i.e. not inside the "msg" of the "userpost"), but maybe it's easier and more backwards-compatible to decide on a de facto textual standard like this:

|short name|http://example.com/very/long/link

I can then post:

I went to |short name| and ZOMG

and shortener-aware clients will

  • Hide the "shortener post"
  • Show "short name" as a link in the ... ZOMG post

If I repost the same short name with a new link, it would only affect later posts (this way I can reuse |my latest blog post| :wink:).
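The `|short name|` convention described above could be handled on the client side roughly like this. It is only a sketch: `parseNamingPost` and `expandNames` are hypothetical names, the `name <url>` rendering is an arbitrary choice, and the map is assumed to hold whichever naming posts were current when the referencing post was made (which is what keeps a reposted name from affecting earlier posts).

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical parser for a "naming post" of the form |short name|URL.
// Returns true and fills name/url only if the whole message has that shape.
bool parseNamingPost(const std::string& msg, std::string& name, std::string& url) {
    if (msg.size() < 3 || msg[0] != '|') return false;
    const std::size_t close = msg.find('|', 1);
    if (close == std::string::npos || close == 1 || close + 1 >= msg.size()) return false;
    const std::string candidate = msg.substr(close + 1);
    // A URL has no spaces; this rejects ordinary posts that merely start
    // with a |short name| reference.
    if (candidate.find(' ') != std::string::npos) return false;
    name = msg.substr(1, close - 1);
    url = candidate;
    return true;
}

// Replaces every known |short name| in a text post with "short name <URL>".
// Unknown names are left untouched.
std::string expandNames(const std::string& msg,
                        const std::map<std::string, std::string>& names) {
    std::string out;
    std::size_t pos = 0;
    while (pos < msg.size()) {
        const std::size_t open = msg.find('|', pos);
        if (open == std::string::npos) break;
        const std::size_t close = msg.find('|', open + 1);
        if (close == std::string::npos) break;
        const auto it = names.find(msg.substr(open + 1, close - open - 1));
        if (it == names.end()) {
            // Not a known name: copy up to the second bar and retry from it,
            // since that bar may open a real reference.
            out.append(msg, pos, close - pos);
            pos = close;
        } else {
            out.append(msg, pos, open - pos);
            out += it->first + " <" + it->second + ">";
            pos = close + 1;
        }
    }
    out.append(msg, pos, std::string::npos);
    return out;
}
```

A client would feed each incoming post to `parseNamingPost` first, hide it and record the name if it matches, and otherwise pass the post through `expandNames` before display.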

What do you think?

wrewolf commented 10 years ago

@thedod I think something like this: [url name|http://url_link/]

thedod commented 10 years ago

This idea has merged into something @miguelfreitas proposes where a url-shortening is not a "regular twist", but "a slightly different animal", so maybe it's possible to raise the limit there.

Anyway, I asked :wink: q-shortener-140

Erkan-Yilmaz commented 10 years ago

other ways to improve/bypass the char limitation:

(copied from May 24/25 twister conversations):

RT @mrb If the #twister html UI can do (1/2) and (2/2) on split messages, could it not display the message as one?

replies:

  • RT @mfreitas Sure. Requires work but certainly possible. Some thought will be needed on how to handle RT, for ex. @erkan_yilmaz @mrb @erkanyuksel
  • RT @kseistrup Message continuation, a proposal https://groups.google.com/d/msg/twister-dev/XJ0mmYzismI/5-MAL444j2AJ

rbertoche commented 10 years ago

I love Dod's idea! (not sure if I'm modding it a bit)

I think we should use some syntax for named links together with post chaining, and concatenate chained posts before parsing those links. I prefer the syntax used on Wikimedia, as someone else suggested.

That way links may span over 2 or more chained messages. No char limit for links, no need to change message size limit.

mrvdb commented 10 years ago

The length of messages people talk about most (and presumably care about the most) is the length in the /presentation layer/, that is, being able to write a message of a certain length and have it handled as one message, not as one that is split up.

It seems that, taking the twister_html client as example, there exists knowledge in the system on:

  1. how many parts a message consists of
  2. which part of this message it is (the "order" in the sequence)

The underlying transport (DHT/torrents) may require a certain message length to function optimally. As @miguelfreitas notes, likely something below 15kb.

If the client can transparently split and re-assemble the 'transport packets', and this seems to be the case, the length of the message in the /presentation layer/ becomes 'just a configuration option' locally and would make the discussion of the length in that layer a bit easier.
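As an illustration of the re-assembly step, here is a minimal sketch that assumes each part carries its position and the total count (like the "(1/2)" and "(2/2)" markers). The `Part` struct and `reassemble` function are hypothetical and do not reflect how twister_html actually stores this information.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical representation of one piece of a split message.
struct Part {
    std::size_t index;  // 1-based position in the sequence
    std::size_t total;  // number of parts in the whole message
    std::string text;
};

// Re-assembles parts received in any order. Returns false (and leaves `out`
// untouched) if a part is missing, duplicated, or disagrees about the total.
bool reassemble(std::vector<Part> parts, std::string& out) {
    if (parts.empty()) return false;
    const std::size_t total = parts.front().total;
    if (parts.size() != total) return false;
    std::sort(parts.begin(), parts.end(),
              [](const Part& a, const Part& b) { return a.index < b.index; });
    std::string whole;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (parts[i].total != total || parts[i].index != i + 1) return false;
        whole += parts[i].text;
    }
    out = whole;
    return true;
}
```

With something like this in the client, the length the user sees is decided purely by how many parts the UI is willing to create and glue back together.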

rbertoche commented 10 years ago

Well, I thought the concept of chained messages by itself would remove that limit for the user, and that the only idea about post splitting was a transparent one...

If I get what you're saying, that /presentation layer/ length is a local, artificial limit to prevent long-message spamming?

Still, this shouldn't affect links: if we choose a certain max length for a post composed of chained messages, this length should affect only those bytes that are printed, since that is about the user interface only. Links could still have unlimited length, if we want. Most likely they would have some other limit for some other reason.

This will be only brainstorming until chained messages come upstream...


D166er commented 10 years ago

I need more chars. And I don't like limits!

fredix commented 9 years ago

I agree with that, break the 140 chars limit please :) :+1:

Qqwy commented 9 years ago

To summarize so far:

1) People seem to agree that 140 is an outdated limit, and 256 is probably more natural in this modern age.

2) 'Decentralized Link Shortening' might be an option by sending a link as a separate message (which can be clearly differentiated from normal messages), which is linked to in some way by the normal message.

3) Attaching images (or music, video, other multimedia files) might be done by internally creating a torrent for these files, and starting seeding when the message is sent. In this way, we can decentralize these files as well.

miguelfreitas commented 9 years ago

@Qqwy good summary.

I'd just note that (1) increasing the limit is something that needs careful deployment planning. The limit is enforced by all peers and you certainly don't want to start getting banned by (torrent) peers who believe you are violating the rules, right?

So what I think could be feasible is to code the change with a time restriction, so the new limit would only become valid at some point in the future. This would hopefully give users enough time to update to newer versions that include the (time-disabled) new limit.
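A time-restricted limit could look roughly like the sketch below. Everything in it is an assumption for illustration: the constant names, the activation timestamp, and the idea that the check is keyed on a timestamp all peers see identically for a given post (so that old and upgraded peers agree once the switch has happened).

```cpp
#include <cstddef>
#include <cstdint>

// Illustrative values only; nothing here was agreed on in this thread.
constexpr std::size_t kOldMaxChars = 140;
constexpr std::size_t kNewMaxChars = 256;
// Hypothetical activation time in Unix seconds, to be fixed at release time.
constexpr std::int64_t kActivationTime = 1767225600;

// Limit that applies to a post with the given timestamp.
std::size_t maxCharsAt(std::int64_t postTime) {
    return postTime >= kActivationTime ? kNewMaxChars : kOldMaxChars;
}

// Length check that every peer can evaluate the same way for the same post.
bool isLengthValid(std::size_t msgChars, std::int64_t postTime) {
    return msgChars <= maxCharsAt(postTime);
}
```

Peers that never upgrade would still reject longer posts after the activation time, so the lead time has to be long enough for most of the network to update.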

kseistrup commented 9 years ago

Now another two months have passed. What do we do to make this happen? And what should the limit be?

Erkan-Yilmaz commented 9 years ago

See comment by @mrvdb above: >140 char could be changed in clients, without having to change twister "infrastructure" itself

slr commented 9 years ago

@Erkan-Yilmaz yep, it's true.

meanwhile I like @iShift's comment.

kseistrup commented 9 years ago

Another thing: Twister actually counts the number of UTF-8 chars. From twister.cpp:

    } else if( msgUtf8Chars > 140 ) {

which means up to 4×140=560 bytes, which again means that we could let people use up to 560 bytes per twist without worrying much.
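To make the characters-versus-bytes arithmetic concrete, here is a small sketch. `countUtf8Chars` is a hypothetical helper, not the function twister.cpp actually uses, and it assumes the input is valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a valid UTF-8 string
// by skipping continuation bytes (those of the form 10xxxxxx).
std::size_t countUtf8Chars(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) ++n;  // count lead bytes only
    }
    return n;
}

// Worst-case size in bytes of a message of `chars` code points:
// UTF-8 needs at most 4 bytes per code point.
constexpr std::size_t maxUtf8Bytes(std::size_t chars) { return chars * 4; }

static_assert(maxUtf8Bytes(140) == 560, "140 chars take at most 560 bytes");
```

A 140-character ASCII message is 140 bytes, while 140 emoji would take the full 560 bytes, so the byte budget already varies by a factor of four under the current rule.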

ghost commented 9 years ago

(1) Is it possible to implement something like this:

a) The function for splitting longer posts should be kept, no matter what the decided character limit is.

b) There should be a way to check/calculate the average length of a split post, periodically.

c) In response to what we see in b), the character limit can be adjusted with every major release of Twister. If there is a tendency to use more characters than the current limit, the limit can be adjusted accordingly. (In other words, it would be a policy decision of the project to check and make this change periodically.)

(2) The post-splitting function should make sure URLs don't get split.

(3) To me it is very difficult to have a meaningful conversation within the limit of 140 characters. If we can have 256 characters, that surely is a good start. From what I have seen, people who post to Twister don't generally expect the same notice to appear on Twitter - which is the only social media platform that restricts users to 140 limit. I personally prefer having no limits at all, but then this will stop being a microblogging service. So having some limit is okay :-)
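For point (2), one simple approach is to split only at whitespace, so that any token without spaces (a URL included) always stays in one piece. This is a sketch under that assumption; `splitOnWords` and `utf8Length` are hypothetical names, runs of whitespace are collapsed to a single space, and a single token longer than the limit is emitted as an oversized chunk rather than broken.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Counts code points in a valid UTF-8 string (continuation bytes skipped).
std::size_t utf8Length(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}

// Greedily packs whitespace-separated words into chunks of at most
// maxChars characters, never breaking inside a word.
std::vector<std::string> splitOnWords(const std::string& text, std::size_t maxChars) {
    std::vector<std::string> chunks;
    std::istringstream in(text);
    std::string word, current;
    while (in >> word) {
        if (current.empty()) {
            current = word;
        } else if (utf8Length(current) + 1 + utf8Length(word) <= maxChars) {
            current += ' ';
            current += word;
        } else {
            chunks.push_back(current);
            current = word;
        }
    }
    if (!current.empty()) chunks.push_back(current);
    return chunks;
}
```

The number and length of the chunks this produces is also the kind of data that step b) would need to collect.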

blog2read commented 8 years ago

Yes - 140 characters are really outdated - why copy #twitter here? It could be more than 256, so that it is possible to write two or three sentences. I don't like the idea of implementing it as "split" posts - that makes a text unreadable. Yes please - on my GS instance I have 1024 - that's cool.

jpfox commented 8 years ago

Yes, 140 chars is too short. In fact it is more difficult to write a message in some languages like German or French, where words and syntax use more characters than in English (sorry, I don't know how Asian languages are structured). And yes, 256 seems to be a good choice: longer than the current restriction, but not so long that you can't quickly read a stream of twists. 256 UTF-8 chars always fit in 1 KB (at most 4 bytes each). The idea of specific additional fields like URL or language code (which can be used to filter the stream to languages you understand) is very good too. Often, a long URL uses too many characters in the twist's message.

That's my opinion :-)

miguelfreitas commented 8 years ago

Let's recap what message size limitations we currently have (hard) coded in twister and have a frank discussion about it.

1) acceptSignedPost

2) node_impl::incoming_request (DHT's "putData" primitive)

These two functions have different uses.

While (2) is used just for accepting/rejecting DHT resources, (1) is used pretty much everywhere.

The consequence of violating (2) is not severe: some posts will not be stored in the multi-item DHT (replies/hashtags) but will continue to be stored fine as individual DHT entries. Then if we increase the limit they will get stored again as users start upgrading their twisterd clients.

It is (1) that worries me...

The problem is that this function is also where the signature is checked, and it is called from libtorrent's code while validating pieces. So if we change the validation rule, older clients will disagree about whether the piece is good or not. Receiving an "invalid" piece multiple times will cause peers to ban newer clients who seem to be "misbehaving". Therefore I'm somewhat afraid of triggering this backward incompatibility and degrading the network.

One possible idea is to overflow extra text from "msg" field into another "msg2" field. That will cause older UI clients to just display part of the text, but does not cause any incompatibilities.
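A sketch of how the overflow might work, assuming the split happens at a character boundary after the first 140 characters. The `SplitPost` struct and the function names are hypothetical; only the field names "msg" and "msg2" come from the idea above, and whatever limit would apply to "msg2" itself is left open.

```cpp
#include <cstddef>
#include <string>

// Byte offset just past the first maxChars code points of a valid UTF-8
// string, or s.size() if the string has no more than maxChars code points.
std::size_t utf8PrefixBytes(const std::string& s, std::size_t maxChars) {
    std::size_t chars = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const bool isLead = (static_cast<unsigned char>(s[i]) & 0xC0) != 0x80;
        if (isLead) {
            if (chars == maxChars) return i;  // next character would exceed the limit
            ++chars;
        }
    }
    return s.size();
}

// Hypothetical pair of fields: "msg" stays within the old limit,
// "msg2" carries whatever did not fit.
struct SplitPost {
    std::string msg;
    std::string msg2;
};

SplitPost splitOverflow(const std::string& text, std::size_t maxChars = 140) {
    const std::size_t cut = utf8PrefixBytes(text, maxChars);
    return {text.substr(0, cut), text.substr(cut)};
}

// What an upgraded client would display; an older client shows only msg.
std::string joinOverflow(const SplitPost& p) { return p.msg + p.msg2; }
```

Because "msg" never exceeds the existing limit, a check that looks only at that field would keep passing on older peers, which is the point of the proposal.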

Comments?

stman commented 8 years ago

Hello Miguel, and all other contributors,

Just a little info.

FYI guys, as I designed SMS/MMS routers connected to GSM SMSCs at a previous start-up (NEOTIS TELECOM), I can assure you that the 140 char limitation is not a hard limit. The original GSM standard says that SMS messages of 140 chars can be concatenated, up to 4 SMSs if I remember the standard correctly. If you want, I can dive into the relevant GSM standard again and show you the page where these things are defined, to check how many concatenated SMSs any basic GSM handset can manage and must be compatible with.

stman commented 8 years ago

Ok, then, I have the answer about the SMS messages limitations according to the original GSM Standard. Please download the "SMS Short Message Service" Core GSM Standard here : http://www.etsi.org/deliver/etsi_ts/100900_100999/100901/07.05.00_60/ts_100901v070500p.pdf

Then go to page 63, and start reading section 9.2.3.24.1 but also section 9.2.3.24.8 page 68, and you will know everything you need to know about GSM SMS message size limitations....

miguelfreitas commented 8 years ago

Thanks @stman but i'm not much worried about SMS anymore... People convinced me we don't need to care. I'm worried about compatibility within our own protocol.

stman commented 8 years ago

Excuse me @miguelfreitas, I read the thread too quickly and didn't see that somebody had already given some info about concatenated SMS.... Glad to hear you no longer worry about msg size anyway :-)

jpfox commented 8 years ago

Miguel, you know better than anyone how twister works, and your analysis seems good. Splitting the overflowing text into a msg2 field is a good idea. It can guarantee backward compatibility and network stability. I encourage you to work in this direction.

mrvdb commented 8 years ago

Have we exhausted the options that don't change the underlying protocol? It seems to me, but I may be missing something, that there is enough information for client software to re-assemble messages which have been split.

It feels like this problem should not be solved in the transport layer.