buckket / twtxt

Decentralised, minimalist microblogging service for hackers.
http://twtxt.readthedocs.org/en/stable/
MIT License

How to store metadata about a feed #48

Open Benaiah opened 8 years ago

Benaiah commented 8 years ago

A number of different issues and ideas have made clear the need for a place to specify metadata about a twtxt.txt feed. For instance, essentially every idea for notifications so far needs to know where the notifications should go (technical details vary based on the proposal). The question then is how to store metadata.

Discussion in #22 has suggested a general comment character, which would let each client decide for itself how metadata is stored. I suggest building on this: allow general comments, but reserve the following format specifically for metadata:

# this is a regular comment

# the next line is a metadata entry
# nick = benaiah

This echoes the .ini format of the twtxt config file, which I think gives it a nice consistency.
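A minimal sketch of how a client might pick such entries out of a feed, assuming a metadata line is any comment of the form `# key = value`. The regular expression and the name `parse_metadata` are illustrative and not part of twtxt:

```python
import re

# Assumed shape of a metadata entry: "# key = value" with a single-word key.
METADATA_RE = re.compile(r"^#\s*(\w+)\s*=\s*(.*?)\s*$")


def parse_metadata(lines):
    """Collect '# key = value' comment lines into a dict; skip everything else."""
    metadata = {}
    for line in lines:
        match = METADATA_RE.match(line)
        if match:
            key, value = match.groups()
            metadata[key] = value
    return metadata


feed = [
    "# this is a regular comment",
    "",
    "# the next line is a metadata entry",
    "# nick = benaiah",
]
print(parse_metadata(feed))
```

One consequence of this format is that a free-text comment that happens to look like `# word = something` would also be read as metadata, so clients would need to agree on which keys they recognise.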

The other main suggestion for metadata is to have another file. I dislike this approach because it complicates the protocol, significantly increases how much twtxt has to hit the network, and requires either a second URL for each person (for the metadata file), switching twtxt.txt to hold metadata and having another file hold the feed, or putting a metadata entry in twtxt.txt that points to the metadata file.

mdom commented 8 years ago

@adiabatic I'm a big fan of the #TIMESTAMP\taction syntax. I just had the feeling that there was a movement toward the IRC-style metadata. I think we just need to decide on one solution. @DracoBlue, @quite What about TIMESTAMP#action?

DracoBlue commented 8 years ago

Ok for me, too. Can somebody try how twtxt and current registries behave if this is in the feed?

mdom commented 8 years ago

Let's find out. I just updated my twtxt.txt with both versions.

DracoBlue commented 8 years ago

http://twtxt.reednj.com/user/8c8d189d1c6f8810

Handles (0) like a normal "post". The others don't appear.

roster, registry and twtxt-ui ignore all versions in your posts.

On Wednesday, 23 March 2016, Mario Domgoergen wrote:

Let's find out. I just updated my twtxt.txt with both version.


mdom commented 8 years ago

twtxt dies with a stack trace when parsing (1), but ignores (0) and (9). Separating the timestamp from the metadata with a hash sign seems to be ignored by all clients. And we could still allow any kind of whitespace for normal tweets. :+1:

DracoBlue commented 8 years ago

Ok.

So:

TIMESTAMP#action param1

is the final version?
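A rough sketch of how a client could tell the two line types apart under this proposal, assuming the timestamp itself never contains `#` or whitespace, so whichever of the two comes first decides the line type. The function name `parse_line` and the returned dictionary layout are made up for illustration:

```python
import re
from datetime import datetime

# Timestamp, then either '#' (metadata) or whitespace (regular twt), then the rest.
LINE_RE = re.compile(r"^(\S+?)(#|\s+)(.*)$")


def parse_line(line):
    """Return a dict describing the line, or None if it is not a feed entry."""
    match = LINE_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    stamp, separator, rest = match.groups()
    try:
        when = datetime.fromisoformat(stamp)
    except ValueError:
        return None  # e.g. a plain '# comment' line
    if separator == "#":
        action, _, params = rest.partition(" ")
        return {"kind": "meta", "time": when, "action": action, "params": params}
    return {"kind": "twt", "time": when, "text": rest}


print(parse_line("2016-03-24T20:15:00+01:00#action param1"))
print(parse_line("2016-02-25T18:11:02+01:00\tHello #world"))
```

A `#` later in a regular twt (a hashtag, say) is not mistaken for metadata, because the whitespace separator is found first.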

archusr commented 8 years ago

Shall we vote? Until when? (Wait for >50% of 14 participants (=8) in this thread?) https://doodle.com/poll/gh27hhtixvbttvdp Result so far:

2016-03-24T20:15:00+01:00#action

timofurrer commented 8 years ago

Let's vote in here with the emojis ;)

DracoBlue commented 8 years ago

I think 4 votes is clear! ;)

mdom commented 8 years ago

txtnix and twtxt-roster both support the new syntax.

buckket commented 8 years ago

I have a few questions here:

After giving it some thought, I’d rather stick with a very simple, yet robust concept:

# Hello, this is a comment, it should be ignored.
# This is my twtxt feed, be welcome!
#
# NICK = buckket
# LINKBACKURL = http://example.org/linkback
# FOLLOWINGS = http://example.org/followings.txt
2016-02-25T18:11:02+01:00   Rather busy this week, will try to resolve some issues with twtxt soon!
2016-02-25T18:11:31+01:00   Especially the metadata situation needs some attention.

This way we can strip all the unnecessary metadata by removing lines starting with #, thus getting all the raw twts without much parsing work. E.g. by using: sed '/^#/ d'. That illustrates the idea and intention behind twtxt very well: keeping everything so simple that you can modify, extract and use the data with simple shell commands. Other benefits are the easy parsing and the rather clear visual distinction between content and metadata.
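For clients that are not shell scripts, the same filter is a one-liner. The sketch below mirrors sed '/^#/ d' in Python; the function name `strip_comments` is invented for this example and the sample twt text is shortened:

```python
def strip_comments(text):
    """Drop every line that starts with '#', like sed '/^#/ d'."""
    return [line for line in text.splitlines() if not line.startswith("#")]


feed = """\
# Hello, this is a comment, it should be ignored.
#
# NICK = buckket
2016-02-25T18:11:02+01:00\tRather busy this week.
2016-02-25T18:11:31+01:00\tEspecially the metadata situation needs some attention.
"""
for twt in strip_comments(feed):
    print(twt)
```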

Another reason why it might be good having metadata at the top without having to go through the entire file: HTTP Range Requests. If you want to check only the metadata, request only the first x bytes, where x is a number big enough to house all relevant information.
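As an illustration of the range-request idea, here is a sketch using only Python's standard library. The helper names and the default of 1024 bytes are arbitrary choices for the example, and a server is free to ignore the Range header and send the whole file:

```python
import urllib.request


def build_head_request(url, nbytes=1024):
    """Build a request asking only for the first `nbytes` bytes of the feed."""
    request = urllib.request.Request(url)
    request.add_header("Range", f"bytes=0-{nbytes - 1}")
    return request


def fetch_head(url, nbytes=1024):
    """Fetch the start of a feed; returns (status, text)."""
    with urllib.request.urlopen(build_head_request(url, nbytes), timeout=10) as response:
        # 206 means the server honoured the range; 200 means it sent everything.
        return response.status, response.read().decode("utf-8", errors="replace")


request = build_head_request("http://example.org/twtxt.txt", 512)
print(request.get_header("Range"))
```

A client using this would still have to discard the last line of the response, since the byte limit can cut it in half.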

Sorry for not responding sooner.

mdom commented 8 years ago

On Thu, Mar 24, 2016 at 08:45:55AM -0700, Felix Bayer wrote:

  • Do we always need a timestamp prepending metadata? There are plenty

I think prepending a timestamp makes things easier for twtxt clients, as we can still just append to the twtxt file. If we put the metadata in the header, I have to rewrite the twtxt file every time the metadata changes, and I should probably flock the twtxt file then. Appending, by contrast, is atomic on Linux up to 4k, and up to 512 bytes on most Unixes.
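The append-only approach could look roughly like the sketch below, which opens the file in append mode and writes each entry with a single write call. The atomicity limits quoted above are taken from the comment as stated and are not verified here; `append_line` and the `#nick` action are hypothetical names:

```python
import os
import tempfile


def append_line(path, line):
    """Append one entry to the feed using O_APPEND and a single write call."""
    data = (line.rstrip("\n") + "\n").encode("utf-8")
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)


# Demonstration against a throwaway file rather than a real feed.
with tempfile.TemporaryDirectory() as tmp:
    feed_path = os.path.join(tmp, "twtxt.txt")
    append_line(feed_path, "2016-03-24T20:15:00+01:00\tHello")
    append_line(feed_path, "2016-03-24T20:16:00+01:00#nick benaiah")
    with open(feed_path, encoding="utf-8") as handle:
        written = handle.read().splitlines()
print(written)
```

With metadata in a header block, by contrast, the client would have to read, modify and rewrite the whole file, which is where the locking concern comes from.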

And if we decide not to add a timestamp, and in five weeks we find a piece of metadata where it would be really useful to have time information, that ship will have sailed. It would be nice to have the most general solution.

Maybe we can have an optional timestamp and in case it's missing we just assume now() for ordering?
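To make the "assume now()" idea concrete, a small sketch: entries are modelled as hypothetical `(timestamp_or_None, text)` pairs, and a missing timestamp is replaced by the current time only for sorting purposes:

```python
from datetime import datetime, timezone


def sort_entries(entries, now=None):
    """Order entries by time, treating a missing timestamp as 'now'."""
    now = now or datetime.now(timezone.utc)
    return sorted(entries, key=lambda entry: entry[0] or now)


entries = [
    (None, "nick = mdom"),
    (datetime(2016, 3, 24, 20, 15, tzinfo=timezone.utc), "an older, dated entry"),
]
for stamp, text in sort_entries(entries):
    print(stamp, text)
```

The side effect is that undated entries always sort as the newest ones, which may or may not be what a client wants.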

  • If you want to do a /me-style message just use *having a good day*, adding a new kind of message type, which then is displayed differently in the client is unnecessary.

I don't think anyone proposed that. The discussion was whether the /command syntax would make it impossible to use /me at the beginning of a tweet. How to display the /me should be up to the client.

After giving it some thought, I’d rather stick with a very simple, yet robust concept:

# FOLLOWINGS = http://example.org/followings.txt
2016-02-25T18:11:02+01:00 Rather busy this week, will try to resolve some issues with twtxt soon!
2016-02-25T18:11:31+01:00 Especially the metadata situation needs some attention.

This way we can strip all the unnecessary metadata by removing lines starting with #, thus getting all the raw twts without much parsing work. E.g. by using: sed '/^#/ d'.

I always liked the # ts metadata idea. Maybe with an optional ts and no requirement to add it at the beginning of the file?

Another reason why it might be good having metadata at the top without having to go through the entire file: HTTP Range Requests. If you want to check only the metadata, request only the first x bytes, where x is a number big enough to house all relevant information.

If we just append, we could remember the end of the last request and only request new lines. But as the order of tweets is not defined and users can change their twtfiles in the middle, this is not happening... :)

archusr commented 8 years ago

Just a wild idea for now to keep it simple and open:

# This line is some random comment.
# @nick mynick
# @nick[2016-03-24] mynick
# @followings url http://twtxt.org/followings.txt
# @followings json [{"url":"http://twtxt.org/twtxt.txt", "nick":"twtxt"}, {..}]
# @follow[2016-03-24T21:33:47+01:00] twtxt @<foo http://foo.bar>, @<eg http://eg.org>

i.e. parameter[optional date/timestamp] literal or datatype and value
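One possible reading of this syntax, sketched as a parser. It assumes that a remainder containing a space is "datatype and value" and that a single word is a literal; that split rule is a guess from the examples, and `parse_entry` is an invented name:

```python
import re

# "# @name", an optional "[date or timestamp]", then the remainder of the line.
ENTRY_RE = re.compile(r"^#\s*@(\w+)(?:\[([^\]]*)\])?\s+(.*)$")


def parse_entry(line):
    """Parse '# @name[stamp] datatype value'; return None for other lines."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    name, stamp, rest = match.groups()
    datatype, separator, value = rest.partition(" ")
    if not separator:
        # A single word is treated as a literal with no datatype.
        datatype, value = None, rest
    return {"name": name, "stamp": stamp, "datatype": datatype, "value": value}


print(parse_entry("# @nick[2016-03-24] mynick"))
print(parse_entry("# @followings url http://twtxt.org/followings.txt"))
```

Under this rule a literal containing spaces would be misread as a datatype plus value, so a real specification would need to pin the grammar down more tightly.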

DracoBlue commented 8 years ago

one vs two files

If we want to put some meta data in an extra file: let's put most of the data in this extra file.

Having

# meta=https://dracoblue.net/twtxt.meta

at the beginning, would allow us to reuse the ini style of twtxts config with its content:

https://dracoblue.net/twtxt.meta:

[twtxt]
nick=dracoblue
twturl=https://dracoblue.net/twtxt.txt
[followings]
buckket=http://buckket.org/twtxt.txt
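Since the file is plain ini, a client could probably read it with a stock ini parser. A sketch with Python's `configparser`, using the sample content above; interpolation is switched off on the assumption that URLs may contain `%` characters, and `parse_meta` is an illustrative name:

```python
import configparser

META_TEXT = """\
[twtxt]
nick=dracoblue
twturl=https://dracoblue.net/twtxt.txt
[followings]
buckket=http://buckket.org/twtxt.txt
"""


def parse_meta(text):
    """Parse the ini-style meta file into a dict of sections."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


meta = parse_meta(META_TEXT)
print(meta["twtxt"]["nick"])
print(meta["followings"])
```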

Having additional #twtxt.nick=dracoblue in the twtxt file to avoid the extra request, would be nice, but not really necessary.

The advantage of this approach is that range requests would really be applicable, since the meta head would change rarely, if at all.

The information about followings and so on would be nice for "displaying" a profile page (like in twtxt-ui) and for having an officially supported way to store the information.

timestamp for meta

If I can see in my timeline at what time one of my followings started to follow somebody, that's quite nice ;). Having /me likes this resolved to * dracoblue likes this is nice in the client, but no problem if the client doesn't have this magic.

smeagolthellama commented 5 years ago

are there any conventions about this stuff yet? Or, just in general, any progress?