xoopscaptcha configuration

GregMage commented 8 years ago

In \class\captcha\config.php there is this comment: /** This keeping config in files has really got to stop. If we can't actually put these into the actual XOOPS config then we should do this. (Who said this? You are right!) */

I agree with that!

geekwright commented 8 years ago

:+1:

In 2.6, I intend to make captcha a service. Essentially, a captcha service provider could come packaged as a module, with its own preferences to configure it. Everything would work the same way. What we have right now is too difficult to manage.

If you have any ideas on making it better in 2.5 I'm listening :ear:

GregMage commented 8 years ago

I think we must create several strategies to fight spam. My idea is to create a new extension to the module system. This extension would configure the captcha systems, but it could also do more. Examples:

Choose the groups for which a captcha form has to be displayed.
Create custom rules for displaying the captcha (for example, display the captcha to users below a certain number of posts).

This extension would be good for XOOPS 2.5.9 and a good complement to the protector filters.
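To make the two example rules concrete, here is a minimal, language-neutral sketch in Python (a real extension would be PHP inside XOOPS). The names `CaptchaRules`, `challenged_groups`, `min_posts_to_skip` and the group labels are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CaptchaRules:
    # Hypothetical settings an administrator might configure.
    challenged_groups: set = field(default_factory=set)  # groups that always see a captcha
    min_posts_to_skip: int = 0  # users with fewer posts than this see a captcha


def should_show_captcha(rules, user_groups, post_count):
    """Return True when either example rule says a captcha is required."""
    # Rule 1: the user belongs to at least one challenged group.
    if rules.challenged_groups & set(user_groups):
        return True
    # Rule 2: the user has not yet reached the configured post count.
    return post_count < rules.min_posts_to_skip


rules = CaptchaRules(challenged_groups={"anonymous"}, min_posts_to_skip=5)
print(should_show_captcha(rules, ["registered"], post_count=2))   # new member
print(should_show_captcha(rules, ["registered"], post_count=40))  # established member
```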

We should also modify the registration page, and all the other necessary places in XOOPS and the essential modules, to add a captcha.

geekwright commented 8 years ago

I'm going to throw out some ideas about the direction I have been planning on taking this.

Some basic principles and assumptions.

A positive user experience is the goal

Preventing SPAM is only important because SPAM detracts from the user experience for legitimate site visitors. Attempting to prevent SPAM by making the legitimate user experience more awkward would be counterproductive.

Captcha is not an ideal solution

Captcha attempts to discern if a site page is being manipulated by a bot. Bots are getting better at emulating humans within this realm. Legitimate users suffer, as captcha is an awkward extra step, and that extra step requires tests of perception or cognition which can often discriminate against persons with certain handicaps, as well as those of different cultures or different proficiency with the language.

Not all SPAM comes from BOTS

Low-wage human labor often offers a competitive cost-benefit ratio compared to bots. Humans, especially those operating at this low wage level, may be slightly impeded by captcha, but not stopped.

Solutions must be system wide

To be effective, an anti-spam solution must protect the entire system, and not be dependent on changes to module specific code.

Any solution should be portable to the next generation

XOOPS is on the edge of a big transition. Any solution needs to keep that transition in mind. A design that requires changing dozens of programs will make it difficult to port. The time invested in the solution should be an investment in the future, not just in the present.

So what does that mean?

We need to back away from the point where a captcha would be deployed to get a better perspective.

A legitimate user has a different pattern of behavior than a spammer, be it bot or human.

A legitimate user looks around, and reads articles and forums.
When a legitimate user makes a posting, the time between the form being presented and being submitted varies.
There are potentially many more patterns.

We don't capture this kind of data. We should. We can.

Instead of refitting every form to present a captcha, we should focus on a solution which operates at a higher level. We have events (preloads) triggered at specific points in a transaction. We can easily add events if needed; they are cheap and easy.

We should present challenges, such as captcha, when actual behaviors suggest there may be an issue.

A general concept for a User Behavior Layer

Create a persistent storage for user behavior data. Rather than define a traditional set of columns, I would suggest a minimal table, such as a user id, a review needed flag, and a big JSON column. We can easily add or change techniques without a bunch of database updates every time. Spammers adapt, we should be adaptable if we need a new metric.
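As a rough illustration of the minimal table idea, here is a language-neutral sketch in Python using SQLite. The table name `user_behavior`, its column names, and the metric keys are all hypothetical; a XOOPS implementation would use its own database layer and naming.

```python
import json
import sqlite3

# Hypothetical schema: a user id, a review-needed flag, and one big JSON column.
SCHEMA = """
CREATE TABLE IF NOT EXISTS user_behavior (
    uid           INTEGER PRIMARY KEY,
    review_needed INTEGER NOT NULL DEFAULT 0,
    data          TEXT    NOT NULL DEFAULT '{}'
)
"""


def open_store(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def load_behavior(conn, uid):
    """Return (review_needed, metrics) for a user; empty defaults if unknown."""
    row = conn.execute(
        "SELECT review_needed, data FROM user_behavior WHERE uid = ?", (uid,)
    ).fetchone()
    if row is None:
        return False, {}
    return bool(row[0]), json.loads(row[1])


def record_metric(conn, uid, key, value, flag_for_review=False):
    """Merge one metric into the JSON blob. New metrics need no schema change."""
    flagged, data = load_behavior(conn, uid)
    data[key] = value
    conn.execute(
        "INSERT OR REPLACE INTO user_behavior (uid, review_needed, data) "
        "VALUES (?, ?, ?)",
        (uid, 1 if (flagged or flag_for_review) else 0, json.dumps(data)),
    )
    conn.commit()


conn = open_store()
record_metric(conn, 42, "page_views", 7)
record_metric(conn, 42, "spam_filter_hits", 1, flag_for_review=True)
print(load_behavior(conn, 42))
```

Adding a new metric later is just a new key in the JSON document, which is the adaptability argument made above.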

A central mechanism to manipulate user behavior data is a must. For example, there are checks and filters in protector that should be aggregated into the user behavior data -- triggering a spam filter should be recorded.

Capture a running tally of page visits in the session. The 'eventCoreIncludeCommonStart' would be a good place to sniff this. We can obtain the session start time, last transaction time, the method (GET/POST), and more from here. We need to capture in the session, so that the behavior before signing in can be captured, and merged into persistent storage for the user when/if the user signs in. (User may start reading, and only sign in to reply, for example.)
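A minimal sketch of the tally-then-merge idea, written in Python for illustration only. The session is modeled as a plain dict, and the function names and the `behavior` key are hypothetical; in XOOPS this logic would sit in a preload listening for the event named above.

```python
import time


def track_request(session, method, now=None):
    """Update the per-session tally; meant to run once per request."""
    now = time.time() if now is None else now
    tally = session.setdefault(
        "behavior", {"started": now, "last_seen": now, "counts": {}}
    )
    tally["last_seen"] = now
    tally["counts"][method] = tally["counts"].get(method, 0) + 1
    return tally


def merge_on_sign_in(session, stored):
    """Fold the anonymous session tally into the user's stored metrics."""
    tally = session.pop("behavior", None)
    if tally is None:
        return stored
    merged = dict(stored)
    counts = dict(stored.get("counts", {}))
    for method, hits in tally["counts"].items():
        counts[method] = counts.get(method, 0) + hits
    merged["counts"] = counts
    merged["last_seen"] = tally["last_seen"]
    # Keep the earliest sighting if the user already has history.
    merged.setdefault("first_seen", tally["started"])
    return merged


session = {}
track_request(session, "GET")
track_request(session, "GET")
track_request(session, "POST")
print(merge_on_sign_in(session, {"counts": {"GET": 10}}))
```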

Add an event trigger in XoopsForm::__construct() so that we can add behavior data into the form. For example, the elapsed time between form and submit could be valuable, both as a standalone item and as a user average. We could add that data in the form, ensure it is tamper resistant with JWT, and collect it all without ever touching existing code.

Instead of depending on the form processing code in each program to converse with captcha, we should consider an interstitial (in-between) page, if and only if there is a need. We can determine up front that extra interaction is required, such as captcha, or identity confirmation. Instead of proceeding directly to the intended program, we can divert the post attempt to a program dedicated to the interstitial responsibilities. It will capture the post data to continue the submit if the user response to the challenge is successful.
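The interstitial flow can be sketched roughly as follows, in Python for illustration. Everything here is hypothetical: the `PENDING` store, the function names, and the target path. A real version would park the data in the session or database and render an actual challenge page.

```python
import secrets

# Hypothetical server-side store for parked submissions.
PENDING = {}


def intercept_post(target, post_data, needs_challenge):
    """Forward the post directly, or park it and ask for a challenge first."""
    if not needs_challenge:
        return ("forward", target, post_data)
    ticket = secrets.token_urlsafe(16)
    PENDING[ticket] = (target, dict(post_data))
    return ("challenge", ticket, None)


def resume_after_challenge(ticket, challenge_passed):
    """Continue the original submit only if the challenge was answered correctly."""
    parked = PENDING.pop(ticket, None)  # tickets are single use
    if parked is None or not challenge_passed:
        return ("rejected", None, None)
    target, post_data = parked
    return ("forward", target, post_data)


action, ticket, _ = intercept_post("/newpost.php", {"subject": "hello"}, True)
print(action)
print(resume_after_challenge(ticket, challenge_passed=True))
```

The intended program never sees the post unless the challenge succeeds, so its form handling code stays untouched.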

At this particular point, Google's reCaptcha is the state of the art. It draws on many factors to make its determination, based on advanced heuristics and more data points than we will ever see. While this choice should ultimately be configurable, a working solution that fixed this as the configuration would be an acceptable solution for the first cut.

To apply the results, there should be an easy way for an administrator to review the data associated with the user (remember the review needed flag?) and take actions, such as deactivating, or deleting the user.

Path forward

There is a real need for improvement in the way XOOPS handles spam. This concept is an attempt to meet that need. As stated, it represents a fairly small effort, and should be easy to deliver. But, there is no doubt this will be an ongoing iterative process.

Feedback is needed, and a general consensus on the acceptability of this approach should be reached before more work starts.

Protector functionality overlaps this domain. Including the user behavior layer in protector makes sense, and would probably be the quickest and most direct path. If we elect to add the user behavior layer as a freestanding component, we should consider moving protector's spam related checks into that new layer so they can function as a unified set, leaving protector to deal with injections and DoS attempts. It would handle just the hostile actions, not the experience degradation.

Give this some thought and share any concerns and suggestions.

Note: I will have limited availability for the next week, so I'm not intentionally ignoring anyone. Thanks for your patience.

GregMage commented 8 years ago

I agree with your vision. We can work on that together, if you want. I think it is still important to create a new extension to the module system to configure the XOOPS captcha.

geekwright commented 8 years ago

I am counting on working together on this! Thanks!

I really want to focus most of the XoopsFormCaptcha class upgrades into 2.6, instead of 2.5. See #228

We have a great chance to fix this whole area properly in 2.6 by introducing new patterns that correct all the issues you are finding in the legacy implementation.

We'll add the User Behavior Layer using protector filters in 2.5, and port it to 2.6. I'll be breaking that out in more detail. I'll also be reviewing the reCaptcha changes as soon as I can. Looks good so far, but I'm still catching up.

Thanks again.

GregMage commented 8 years ago

You're right, we should not change 2.5.x, and we should do things properly in 2.6.
