niemanlab / openfuego

Watching Twitter all day—so you don’t have to.
http://www.niemanlab.org/fuego
MIT License

"Freshness vs. quality" algorithm doesn't work #5

Open iwanv9 opened 11 years ago

iwanv9 commented 11 years ago

Hi there,

Thank you for your great application. I've been looking for something like this for a long time, and now I've found it. Great for my work as a journalist. :)

I have one question. Everything is set up and works, but the "freshness vs. quality" algorithm doesn't seem to work. For example, in your examples/getItems.php file, "getItems(10, 1, FALSE, TRUE);" works fine, but "getItems(5, 1, TRUE, TRUE);" produces no output. That's a pity, because it's such a great option!

Hopefully you can check where it's going wrong (or tell me where I can check that myself). If I can do anything for you (check settings, files, the db, whatever), please let me know.

Thanks in advance!

Iwan

phelps commented 11 years ago

Hi, Iwan:

The freshness vs. quality algorithm — which we'll call the salsa — won't return any links until OpenFuego has been running for several hours. I should make this clear in the documentation. That's because it has a minimum threshold for what makes a "high-quality" link, which is proportional to the number of hours you specify.
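
To illustrate why a fresh install comes back empty, here is a rough sketch in Python (not OpenFuego's PHP, and not the actual salsa formula). The names `salsa_filter`, `per_hour` and the `score` field are made up for the example; the only thing taken from the explanation above is that the minimum score grows in proportion to the number of hours requested.

```python
def passes_threshold(score, hours, per_hour=1.0):
    """Return True if score meets a minimum that grows with the window length.

    per_hour is a hypothetical constant, not a value taken from OpenFuego.
    """
    return score >= hours * per_hour


def salsa_filter(links, hours, per_hour=1.0):
    """Keep only the links whose score clears the hour-proportional minimum."""
    return [
        link for link in links
        if passes_threshold(link["score"], hours, per_hour)
    ]


# Shortly after start-up, few shares have been seen, so scores are low.
fresh_install = [{"url": "a", "score": 0.4}, {"url": "b", "score": 0.9}]
# After running for a while, some links have accumulated a higher score.
after_a_while = fresh_install + [{"url": "c", "score": 3.2}]

print(salsa_filter(fresh_install, hours=1))
print(salsa_filter(after_a_while, hours=1))
```

With these made-up numbers, the first call returns nothing because no link has reached the minimum yet, which would match the empty output described in the report.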

Have you left it running for a while? Want to try it again now?

I'm also working to improve Fuego's logging/error reporting so these issues aren't so cryptic.

iwanv9 commented 11 years ago

Hi Andrew,

Thank you for your quick reply, and sorry for my late one!

I have had OpenFuego running for some days now (on my own server, as the root user). When I turn the salsa off, it works great. But the salsa still isn't a success over here: sometimes it shows only a few items (2 or 3, for example) when I ask the script for 20. Other times the script doesn't return any results at all (although the database is quite full). Isn't that strange? Shouldn't the example files always show the number of items I ask for? (And when it runs out of 'hq links', add the most recent lq links or something like that?)
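
What I have in mind is roughly the following, sketched in Python rather than in OpenFuego's PHP. The function name `fill_to_quantity` and the `url` field are invented for the example, and I'm assuming the 'recent' list is already sorted newest first; none of this is taken from the OpenFuego code.

```python
def fill_to_quantity(high_quality, recent, quantity):
    """Return up to `quantity` links: high-quality ones first, then backfill.

    high_quality: links that passed the quality filter, best first.
    recent: all links, assumed to be sorted newest first.
    """
    picked = list(high_quality[:quantity])
    seen = {link["url"] for link in picked}
    for link in recent:
        if len(picked) >= quantity:
            break
        # Skip links that are already in the result.
        if link["url"] not in seen:
            picked.append(link)
            seen.add(link["url"])
    return picked


hq = [{"url": "x"}, {"url": "y"}]
latest = [{"url": "n1"}, {"url": "x"}, {"url": "n2"}, {"url": "n3"}]
print([link["url"] for link in fill_to_quantity(hq, latest, 4)])
```

That way the caller always gets the requested number of items as long as the database holds enough links in total.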

Another small question: is there a way to find out which links are 'trending' right now (say, in the last five or ten minutes)? I'm always interested in the latest (and most important) news. Couldn't your great script be a basis for showing me that?
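
Something along these lines is what I mean by 'trending', again as a Python sketch and not as a description of how OpenFuego stores or scores anything. I'm assuming each share is available as a (url, timestamp) pair; the function name `trending` and its parameters are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def trending(shares, now, minutes=10, top=5):
    """Count shares per URL within the last `minutes` and return the top few.

    shares: iterable of (url, shared_at) pairs, shared_at being a datetime.
    Returns a list of (url, count) pairs, most shared first.
    """
    cutoff = now - timedelta(minutes=minutes)
    counts = Counter(
        url for url, shared_at in shares if cutoff <= shared_at <= now
    )
    return counts.most_common(top)


now = datetime(2014, 1, 1, 12, 0)
shares = [
    ("http://example.com/a", now - timedelta(minutes=2)),
    ("http://example.com/a", now - timedelta(minutes=4)),
    ("http://example.com/b", now - timedelta(minutes=7)),
    ("http://example.com/c", now - timedelta(minutes=45)),  # outside the window
]
print(trending(shares, now, minutes=10))
```

A plain count like this ignores who shared the link, so it only covers 'latest'; weighing in 'most important' would need some kind of score on top.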

Have a great weekend!

Iwan

Mayeu commented 11 years ago

Hi,

I have set up an OpenFuego instance and I face the same issue. Can you explain in detail your choice of scoring strategy? Why 2.5 and 8 as constants? How did you choose those?

I think these constants may be too restrictive depending on the list of users followed, the subject, and the overall activity of the network. Maybe we should think of values that are not fixed but vary with the context (median value, third quartile, something like that :)).
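
As a rough illustration of the idea (in Python, not OpenFuego's PHP, and not a patch against the current scoring code), the cut-off could be derived from the scores currently in the database instead of from fixed constants. The names `adaptive_threshold` and `top_quartile` and the `score` field are hypothetical.

```python
import statistics


def adaptive_threshold(scores, fallback=0.0):
    """Return the third quartile of the scores, or `fallback` if too few.

    statistics.quantiles needs at least two data points.
    """
    if len(scores) < 2:
        return fallback
    # n=4 yields three cut points: Q1, median, Q3. Index 2 is Q3.
    return statistics.quantiles(scores, n=4)[2]


def top_quartile(links):
    """Keep the links whose score is at or above the third quartile."""
    threshold = adaptive_threshold([link["score"] for link in links])
    return [link for link in links if link["score"] >= threshold]


links = [{"url": f"u{i}", "score": float(i)} for i in range(1, 9)]
print([link["url"] for link in top_quartile(links)])
```

With a cut-off like this, a quiet list of users would still produce some results, because the threshold moves with the observed activity.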

Thank you for any hint.

arejay73 commented 10 years ago

I am having the same issue with the algorithm as well. Has the documentation been updated to explain how this works?