Comment box - Githubissues

Incognito commented 12 years ago

I've added a comment box on the /Data page. It needs to be integrated with Peter's classes, which means moving the page to a PHP file and me setting up the SQL.

We should also place a box on the landing page.

I'm unclear where we'd intend to place output from the comments on the site, please discuss.

I believe the comment box needs an additional field in the database to manage various concerns such as "feedback" or "data set requests", etc.

prsantos-com commented 12 years ago

When I was testing the comment box, I didn't like how displaying all the comments looked, or even displaying the ten most recent ones. It could be because I didn't properly format it; I just spit it out as submitted. I felt it made the page unnecessarily long.

I'm thinking we could have a separate page for comments and data set requests, where the page would be "What the Public is saying" or "What everyone is saying." This page could constitute an Open Hamilton 'Open Forum.'

Just an idea.

Incognito commented 12 years ago

What are your thoughts on using google moderator?

https://sites.google.com/site/moderatorhelpcenter/home/embedding-moderator

JoeyColeman commented 12 years ago

I used Moderator a few times and it's not a great product for embedding. It merely iframes the Moderator page complete with headers and other filler content.

prsantos-com commented 12 years ago

Functionally it's pretty neat, but it looks clunky. Though it seems as though you can style it a bit. Also, I don't like how the Google name is in your face like a cheap ad, but it is already built so...

We could probably copy the tallying part, for instance:

123 questions asked | 43 comments left | 1908372 requests for data sets

[View Questions] [View Comments] [View Data Set Requests]

[Submit Question] [Submit Comment] [Submit Data Set Request]

Of course, we'd have to build it. I wouldn't mind putting in some elbow grease, because I don't think it would be too difficult. We just need to add a 'category' column to the table like you mentioned before and add some methods that specifically pull up a category of comments. Styling of course is the harder part.
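A minimal sketch of the 'category' column and the per-category methods described above. It uses Python and an in-memory SQLite database purely for illustration (the site itself is PHP), and the table name, column names, and category values are assumptions rather than the project's actual schema.

```python
import sqlite3

# Hypothetical schema: names and category values are illustrative assumptions.
def make_comment_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE comments (
               id INTEGER PRIMARY KEY,
               alias TEXT,
               body TEXT NOT NULL,
               category TEXT NOT NULL  -- 'question', 'comment' or 'data_set_request'
           )"""
    )
    return conn

def add_comment(conn, alias, body, category):
    conn.execute(
        "INSERT INTO comments (alias, body, category) VALUES (?, ?, ?)",
        (alias, body, category),
    )

def tally(conn):
    """Return {category: count}, enough to build the summary line."""
    rows = conn.execute(
        "SELECT category, COUNT(*) FROM comments GROUP BY category"
    ).fetchall()
    return dict(rows)

def by_category(conn, category):
    """Return the bodies of all comments in one category, oldest first."""
    rows = conn.execute(
        "SELECT body FROM comments WHERE category = ? ORDER BY id", (category,)
    ).fetchall()
    return [body for (body,) in rows]

conn = make_comment_db()
add_comment(conn, "anon", "Where is the bus data?", "question")
add_comment(conn, "anon", "Nice site.", "comment")
add_comment(conn, "anon", "Please add transit stops.", "data_set_request")
add_comment(conn, "anon", "Please add parks.", "data_set_request")
print(tally(conn))
print(by_category(conn, "data_set_request"))
```

One `GROUP BY` query covers all three tallies, so the summary line does not need a query per category.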

Incognito commented 12 years ago

How would we like to deal with spam/abuse?

Publish everything that happens (deal with it after-the-fact)
Use a login (OAuth http://ca3.php.net/manual/en/oauth.examples.fireeagle.php )
Review comments before publishing
Something else?

prsantos-com commented 12 years ago

Publish everything that happens, use CAPTCHA if anything.
We can implement an optional login, but if people start posting ridiculous things, then we should require a login.
We can review comments if people start writing offensive things, or only review anonymous comments.

I think we just need to concentrate on letting people comment as freely as possible so we can generate interest in Open Hamilton. If that freedom gets abused, then we can tighten things a bit. We'll just repeat that process.

Incognito commented 12 years ago

I'm investigating ways to prevent abuse without captcha or logins.

Captcha is okay, but it's generally a pain for the person solving it to squint at, and they just end up saying "forget it!" I'm looking to see if we can do two things: classify abusive messages, and add unobtrusive bot prevention (i.e., simple bots don't check hidden input validation tied to a session, but some advanced ones do).
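One way the "hidden input validation tied to a session" idea could look, as a rough Python sketch. The session is modelled as a plain dict and the field name `form_token` is a made-up placeholder; a real PHP version would use the session store instead.

```python
import hmac
import secrets

def issue_form_token(session):
    """Store a fresh random token in the session; render it as a hidden input."""
    token = secrets.token_hex(16)
    session["form_token"] = token
    return token

def is_valid_submission(session, form):
    """Accept only a form that echoes the token issued to this session, once."""
    expected = session.pop("form_token", None)  # single use: a replay fails
    supplied = form.get("form_token", "")
    if expected is None:
        return False
    # Constant-time comparison; bytes avoid errors on non-ASCII input.
    return hmac.compare_digest(expected.encode(), supplied.encode())

session = {}
token = issue_form_token(session)
print(is_valid_submission(session, {"form_token": token}))  # first submit
print(is_valid_submission(session, {"form_token": token}))  # replayed submit
```

A bot that posts straight to the form handler without loading the page never receives a token, which is the "simple bot" case. A bot that loads the page and keeps its cookies would still get through.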

Go ahead with your plans to improve the commenting system. If you're unsure of the architecture I could draw up some UML for you that would fit nicely.

prsantos-com commented 12 years ago

Yeah, I feel the same way about Captcha, it's annoying. The site raisethehammer.org uses a natural language logic test like, "What do you get if you multiply 5 and 1?" I haven't seen any weird spam on there, so I'm assuming it works, but I really don't know.

Okay, I'll put something together and see what works. I'm going to add an alias/name box, but I'll leave the 'logging in' out. I'll put in a category in the DB, and decide if the user should be able to select it.

You can send me some UML if it's not a hassle, this would give me more ideas and a general direction, but either way I'm good.

Incognito commented 12 years ago

Natural language logic tests are trivial to break: http://www.wolframalpha.com/input/?i=What+do+you+get+if+you+divide+12+by+3%3F
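To make the point concrete without a third-party service, here is a small sketch of how little code a bot needs for questions of this shape. The pattern only covers "add", "multiply" and "divide" phrasings and is an illustration, not a claim about how any real bot works.

```python
import operator
import re

OPS = {
    "add": operator.add,
    "multiply": operator.mul,
    "divide": operator.truediv,
}

# Matches e.g. "multiply 5 and 1" or "divide 12 by 3" anywhere in the question.
PATTERN = re.compile(r"(add|multiply|divide)\s+(\d+)\s+(?:and|by)\s+(\d+)", re.IGNORECASE)

def solve(question):
    """Return the numeric answer, or None if the question is not recognised."""
    match = PATTERN.search(question)
    if match is None:
        return None
    verb = match.group(1).lower()
    a, b = int(match.group(2)), int(match.group(3))
    if verb == "divide" and b == 0:
        return None
    return OPS[verb](a, b)

print(solve("What do you get if you multiply 5 and 1?"))
print(solve("What do you get if you divide 12 by 3?"))
```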

I'm thinking http://sblam.com/ is a suitable tool...

http://sblam.com/install.html Returns a value between -2 and 2 representing the "level" of the message which we could use to classify spam with. 2 can get auto-published, 0 and below can be sent to a holding status for approval.
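A sketch of the routing rule as described in this comment, taking its reading of the scale at face value. Which end of the -2 to 2 range means spam is an assumption here and should be confirmed against Sblam's documentation before relying on it. The comment does not say what happens to a score of 1; holding it for review is also an assumption. The function and status names are hypothetical.

```python
def status_for_score(score, publish_at=2):
    """Map a classifier score in [-2, 2] to a publishing status.

    Assumption (from the comment above): higher means more trustworthy.
    Anything below `publish_at` is held for manual approval.
    """
    if not -2 <= score <= 2:
        raise ValueError(f"score out of range: {score}")
    return "published" if score >= publish_at else "held"

for score in (2, 1, 0, -2):
    print(score, status_for_score(score))
```

Keeping the threshold as a parameter makes it easy to loosen or tighten the rule once real traffic shows how the scores behave.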

I'd like to see if I can find a Bayesian classifier that isn't via third-party API (latency in response time) but this or http://akismet.com/ are promising looking. I suppose the lag time for a response could be negligible and really just a "nice to have" rather than something we need to do. Running our own classifier is one of those rewards vs simplicity things.

Incognito commented 12 years ago

Okay so let's see if this makes any sense:

http://imgur.com/6Mx1h

Left side is SQL database, right is the objects.

Database contains responses and comments in two tables so we can quickly query each. Published value is something you didn't have in the last one-- basically, it's a way we can turn "off" a post or come back and review it.

ActivityLog is a feature we may wish to implement -- in theory we'll use it to ensure people don't just hammer away on the "upvote" button. It basically monitors each request in a light way, checking when it happened, what IP it came from, and an event log description -- so, if some IP already has the event log item "Posted message MD5 and session cookie Y" we can kick them back out. Similar idea with spamming upvotes.
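A rough sketch of the ActivityLog check, using an in-memory list in place of the database table. The class shape, the event string format, and the one-hour window are all assumptions made for illustration; the diagram does not specify them.

```python
import hashlib
import time

class ActivityLog:
    """In-memory stand-in for the ActivityLog table (hypothetical shape)."""

    def __init__(self):
        self._events = []  # list of (timestamp, ip, description)

    def seen(self, ip, description, within_seconds, now=None):
        """True if this IP already logged the same event inside the window."""
        now = time.time() if now is None else now
        return any(
            event_ip == ip and event_desc == description and now - ts <= within_seconds
            for ts, event_ip, event_desc in self._events
        )

    def record(self, ip, description, now=None):
        now = time.time() if now is None else now
        self._events.append((now, ip, description))

def post_event(message, session_id):
    """Describe a post as 'message MD5 plus session cookie', as in the comment."""
    digest = hashlib.md5(message.encode("utf-8")).hexdigest()
    return f"posted:{digest}:{session_id}"

def try_post(log, ip, session_id, message, now=None, window=3600):
    """Reject a repeat of the same message from the same IP and session."""
    event = post_event(message, session_id)
    if log.seen(ip, event, window, now):
        return False
    log.record(ip, event, now)
    return True

log = ActivityLog()
print(try_post(log, "192.0.2.1", "s1", "hello", now=0))   # first post
print(try_post(log, "192.0.2.1", "s1", "hello", now=10))  # duplicate
```

The same `seen`/`record` pair would cover upvotes by logging an event such as `upvote:<comment id>` instead of a message hash.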

I haven't really drawn any class provisions for it as I wanted to hear your thoughts on it.

The Feedback classes basically just abstract your comment object into a feedback one, where we get responses/comments out of it.

As with most UML, something won't make perfect sense, but as a general image I think this is a good way for you to get started on the next version. Make any changes you see fit. -- (for instance I just noticed the Feedback class doesn't account for the Category, d'oh!)

justinkwanlee commented 12 years ago

I just wanted to chime in about the spam protection. It isn't necessary to use CAPTCHA and I agree it is cumbersome for the end user. You might want to take advantage of honeypots: basically fake hidden fields that a real user can't see, but that the spam bots will gladly fill out. You then do a check server-side to see if the field is filled out; if it is, then the submission is fake and you can leave it out. This isn't foolproof of course, but it filters out about 80% of bots. You can read a bit more about it here (second paragraph): http://www.scirra.com/blog/61/reducing-website-spam
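The server-side half of the honeypot check can be very small. A sketch, assuming the form is available as a dict and the decoy field is called `website` (a made-up name; anything a bot is likely to fill in would do):

```python
HONEYPOT_FIELD = "website"  # hypothetical decoy field, hidden from people with CSS

def looks_like_bot(form):
    """A real visitor never sees the decoy, so any value in it is suspicious."""
    return bool(form.get(HONEYPOT_FIELD, "").strip())

def accept_comment(form):
    """Return the comment body to store, or None to silently drop the post."""
    if looks_like_bot(form):
        return None
    return form.get("body", "").strip() or None

print(accept_comment({"body": "Great data sets!", "website": ""}))
print(accept_comment({"body": "Buy pills", "website": "http://spam.example"}))
```

Dropping the post silently, rather than showing an error, avoids telling the bot which field gave it away.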

Akismet (http://akismet.com/) is also a decent alternative...it's used extensively in a bunch of other projects and it's got a good (well documented) API.

What do you guys think?

Incognito commented 12 years ago

Hey Justin thanks for the feedback and taking some interest in this :).

I believe we've already decided on using hidden random inputs; the main issue is the spam classification. We can either use Akismet, Sblam, or implement our own local one. Do you have further thoughts on this (see previous 4 posts for details)?

justinkwanlee commented 12 years ago

Akismet is being used by a lot of existing open source systems out there...in my experience it has done a good job of filtering out unwanted spam since its dictionary of known spam is quite large. I wouldn't build one from scratch, unless you just wanna learn how to do it.

prsantos-com commented 12 years ago

I looked at sblam and akismet, and if nobody has any objections, I'd like to pursue sblam. I think it has an interesting classification approach and I'd like to see how well it works. If it doesn't do its job that well for any reason, we'll just switch to akismet.

prsantos-com commented 12 years ago

For responses, should we take into account that someone may want to respond to someone's response? Or should we just keep it as a single column like The Spec and YouTube, e.g.:

I think you guys should put up data set that lists cheap motels. alcoholicsementhrower 1 day ago

@alcoholicsementhrower Besides perverts, who benefits from this data? lonelygirl49 2 days ago

Hamilton needs more data sets. boinktheclown22 2 days ago

OR reddit style:

alcoholicsementhrower 1 day ago I think you guys should put up data set that lists cheap motels.

    lonelygirl49 2 days ago Besides perverts, who benefits from this data?

        MrPerfect09 2 days ago It benefits me :)

boinktheclown22 2 days ago Hamilton needs more data sets.

I'm thinking I'll try nesting the comments, because I don't think anyone is going to be a douche and start creating a huge nest with their own comments.

So, instead of the Response table in the DB, I'll have a CommentResponse table which will have 2 columns. One column will hold the parent CommentID and the other will hold the child CommentID. How does that sound?
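A sketch of what the two-column CommentResponse table could look like, again in Python with SQLite for illustration only. The `Comment` table, the `body` column, and the function names are assumptions; only the parent/child CommentID pair comes from the proposal above.

```python
import sqlite3

def make_thread_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Comment (CommentID INTEGER PRIMARY KEY, body TEXT NOT NULL);
        -- One row per reply; UNIQUE keeps each comment to a single parent.
        CREATE TABLE CommentResponse (
            ParentCommentID INTEGER NOT NULL REFERENCES Comment(CommentID),
            ChildCommentID  INTEGER NOT NULL UNIQUE REFERENCES Comment(CommentID)
        );
    """)
    return conn

def add_nested_comment(conn, body, parent_id=None):
    cur = conn.execute("INSERT INTO Comment (body) VALUES (?)", (body,))
    child_id = cur.lastrowid
    if parent_id is not None:
        conn.execute("INSERT INTO CommentResponse VALUES (?, ?)", (parent_id, child_id))
    return child_id

def render(conn, parent_id=None, depth=0):
    """Return the thread as indented lines, replies directly under their parent."""
    if parent_id is None:
        rows = conn.execute(
            "SELECT CommentID, body FROM Comment "
            "WHERE CommentID NOT IN (SELECT ChildCommentID FROM CommentResponse) "
            "ORDER BY CommentID"
        ).fetchall()
    else:
        rows = conn.execute(
            "SELECT c.CommentID, c.body FROM Comment c "
            "JOIN CommentResponse r ON r.ChildCommentID = c.CommentID "
            "WHERE r.ParentCommentID = ? ORDER BY c.CommentID",
            (parent_id,),
        ).fetchall()
    lines = []
    for comment_id, body in rows:
        lines.append("    " * depth + body)
        lines.extend(render(conn, comment_id, depth + 1))
    return lines

db = make_thread_db()
top = add_nested_comment(db, "Put up a data set of cheap motels.")
reply = add_nested_comment(db, "Who benefits from this data?", top)
add_nested_comment(db, "It benefits me :)", reply)
add_nested_comment(db, "Hamilton needs more data sets.")
print("\n".join(render(db)))
```

This version issues one query per comment, which is fine for short threads. A long thread would want a single query plus in-memory grouping instead.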

Incognito commented 12 years ago

I don't see any serious problems with SBlam and as you said, we can change it if need be.

For responses, should we take into account that someone may want to respond to someone's response? Or, should we just keep it as a single column like The Spec and youTube, eg:

For the time being I'd say single-thread is the easiest thing to do, in terms of code complexity and simplicity to an average computer user.

You have the ultimate decision on the comment stuff; you're the one making it.

prsantos-com commented 12 years ago

Sounds good, I'll do single-thread. In your UML diagram, what was your implementation idea for the Responses table/object? Just curious.

Incognito commented 12 years ago

A response would be the main thread, with comments being children of a thread. We index responses on the main view and perhaps preview a few responses to the side of the main. The PHP object would take it from the table; based on a select query, we could create an array of the responses as objects that we work with later.
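As an illustration of "responses as objects built from a select query", here is a sketch in Python rather than PHP. The table and column names, including the `published` flag, are guesses based on the description of the diagram, not the actual schema.

```python
import sqlite3
from dataclasses import dataclass, field

@dataclass
class Comment:
    id: int
    body: str

@dataclass
class Response:
    id: int
    body: str
    comments: list = field(default_factory=list)

def load_threads(conn):
    """Build Response objects, each holding its published child comments."""
    threads = {}
    query = "SELECT id, body FROM responses WHERE published = 1 ORDER BY id"
    for response_id, body in conn.execute(query):
        threads[response_id] = Response(response_id, body)
    query = ("SELECT id, response_id, body FROM comments "
             "WHERE published = 1 ORDER BY id")
    for comment_id, response_id, body in conn.execute(query):
        if response_id in threads:  # skip children of unpublished threads
            threads[response_id].comments.append(Comment(comment_id, body))
    return list(threads.values())

# Demo data (hypothetical schema).
demo = sqlite3.connect(":memory:")
demo.executescript("""
    CREATE TABLE responses (id INTEGER PRIMARY KEY, body TEXT, published INTEGER);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, response_id INTEGER,
                           body TEXT, published INTEGER);
    INSERT INTO responses VALUES (1, 'Hamilton needs more data sets.', 1);
    INSERT INTO responses VALUES (2, 'Held for review.', 0);
    INSERT INTO comments VALUES (1, 1, 'Agreed.', 1);
    INSERT INTO comments VALUES (2, 1, 'Spam link', 0);
    INSERT INTO comments VALUES (3, 2, 'Reply to a held thread.', 1);
""")
threads = load_threads(demo)
for thread in threads:
    print(thread.body, [c.body for c in thread.comments])
```

Two queries are enough for the whole page: one for the threads and one for all their children, grouped in memory.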

I'm not sure what specific details you're looking for about the table/object.

OpenHamilton / openhamilton.ca

Comment box #3