abbysmal / Canopy

A git-blogging unikernel written using MirageOS
ISC License
120 stars 27 forks source link

/tags should give a list of tags #30

Closed hannesm closed 8 years ago

hannesm commented 8 years ago

currently it returns a not found

voila commented 8 years ago

How would you get the list of tags ? Do you propose to walk KeyHashtbl and for each article to collect its tags ? Or should we get the tags directly from the Store module ?

hannesm commented 8 years ago

I'd naively walk the KeyHashtbl.. if it turns out to be a performance bottleneck, caching can be done later

abbysmal commented 8 years ago

I guess walking through KeyHashtbl and building a Set of all tags is enough for now, yeah. We'll cache it one day if needed. No idea on where to put the /tags link, though

dbuenzli commented 8 years ago

Is this done statically or dynamically in the unikernel ?

abbysmal commented 8 years ago

dynamically in the unikernel Everytime new content is pushed to the repository, after pulling it, Canopy walks through the repository (could be better) and cache parsed articles in a hashtable. Tags are available in each article records in this hashtable

dbuenzli commented 8 years ago

Then I'd rather use a Map.Make than a hashtable. This a fullproof way of preventing certain class of dos attacks on hashtables (e.g. if the contents can be user-contributed). In general I don't understand people who start with hashtables first... always start with maps and only consider replacing it with a hash table if that turns out to be a problem.

abbysmal commented 8 years ago

Got this, even though I'm not too sure about how user-contributed the content really is. Contributing content imply having a write and a push access to a git repository (implying the push access is sufficient to trigger the push hook mechanism that the user can define). I guess we can move to a Map with no problem, but how security is handled in Canopy is still a bit blurry…

dbuenzli commented 8 years ago

It may not even be possible to exploit this. It's just an algorithmic mindset: by using a Map you just evacuate this possibility from the software altogether.

Also as the software evolves that bit of code may end up being (re-)used in another context or c&p in other software without much thought, but in a use case where the problem may become relevant.

In any case if you stick to hashtables it would be better to create the custom hashtable with Hashtbl.MakeSeeded and redefine the create function in KeyHasthbl to automatically create hashtables with ~random:true.

abbysmal commented 8 years ago

I agree, I will then move to Map.Make, I don't see any problem in doing so, and @hannesm seems to agree too. Feedback greatly appreciated, thank you :-)