fabacab / pat-okcupid

Alerts you of potential sexual predators on OkCupid based on their own answers to Match Questions patterned after Lisak and Miller's groundbreaking academic work on identifying "undetected rapists."

Allow users to subscribe to inverse of any red flag list #7

Open fabacab opened 11 years ago

fabacab commented 11 years ago

This is a very important feature. If people create red flag question sets that, for instance, red flag feminists, then this feature would enable feminists to USE that list in order to red flag "people who are NOT feminists."

See also https://twitter.com/focalintent/status/331920686614450176
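A minimal sketch of what "subscribing to the inverse" could look like, not the project's actual code. The list shape (`flags` mapping question IDs to flagged answers), the `inverted` field, and the names `invertList` and `isFlagged` are all hypothetical. It also assumes that a profile which answered none of the listed questions is never flagged, in either direction.

```javascript
// Hypothetical data shape: a red flag list maps question IDs to flagged answers.
const redFlagFeminists = {
  name: "Red Flag Feminists",
  inverted: false,
  flags: {
    q1: ["Yes"],
    q2: ["Strongly agree", "Agree"],
  },
};

// Return a copy of the list that flags the opposite group.
function invertList(list) {
  return { ...list, name: "NOT " + list.name, inverted: !list.inverted };
}

// answers: object mapping question ID -> the profile's answer text.
function isFlagged(list, answers) {
  // Only consider listed questions that the profile actually answered.
  const answered = Object.keys(list.flags).filter((q) => q in answers);
  if (answered.length === 0) return false; // assumption: no data, no flag
  const hit = answered.some((q) => list.flags[q].includes(answers[q]));
  return list.inverted ? !hit : hit;
}

// Usage: the same curated list serves both audiences.
const notFeminists = invertList(redFlagFeminists);
console.log(isFlagged(redFlagFeminists, { q1: "Yes" })); // true
console.log(isFlagged(notFeminists, { q1: "Yes" })); // false
console.log(isFlagged(notFeminists, { q1: "No" })); // true
```

Keeping the inverse as a flag on a copy, rather than rewriting the answer sets, means the inverse never drifts out of sync with the original list's questions.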

unquietpirate commented 11 years ago

Huh! Interesting. I hadn't envisioned it that way. I'd figured you could just select the "Red Flag Feminists" list and then consider red flags to be positive, rather than negative, markers. But, of course, this wouldn't work if you were simultaneously using other red flag lists as negatives.

So, subscribing to the inverse makes more sense -- but might be visually overwhelming in cases where the majority of users are red flagged. What about a different visual indicator for "positive" lists, e.g. the predator list is boxed in red and the feminists list is boxed in green?

fabacab commented 11 years ago

We need to work out what a "positive highlight" would mean, as well. See #4 for related discussion.

focalintent commented 11 years ago

So, in a way I think this is one direction OKC was starting to go down when indicating how important a question was to someone. Unfortunately, I don't have a rapidly handy list of all the questions that were absolutely important/vital to me - even worse, many of the questions I answered nearly a decade ago, when I was ... shall we say less aware than I am now (if I see an answer that needs fixing these days, I do. Alas, at the rate that I poke around on there, this results in 1-2 questions/year getting fixed).

In a general sense, this seems to be sliding towards providing a way to group questions and divide answers for those questions into a flag/don't flag camp[1].

While the initial drive for this was to identify abusers (in which case a red flag would, I would like to think, very clearly be a positive/negative identification), and the second level was a way to work around people who might try to abuse the question grouping (e.g. people who make a "find feminists" list to red flag them, allowing feminists to turn that around to either a) find people who are -not- feminists and/or b) find people who are, if they're still interested in using OKC as a way to meet/find people[2]), I find the classification possibilities to be very open/interesting.

The data miner in me is curious - what kinds of classification/question groupings would people put together? What do those groupings that they put together (and/or the other-people-curated groupings they select and whether for positive/negative flagging?) say about that person? About a wider group of people? Greater culture/society, regions, various levels of mainstream vs. not, etc...

How obvious would attempts be to game the system?[3]

However, I think I'm drifting out of scope, here :)

[1] worth it to extend to multiple levels? I'm inclined to say no, personally - c.f. pandora's simple thumbs up/down mechanism vs. iTunes 5-star rating system - but this is a kneejerk bias/reaction fueled by spending a couple of years in the digital music space.

[2] As a shy introvert (for both of those axes), who has concerns about inadvertently putting someone in a position where societal/cultural pressures push them to say yes, I can count on one hand the number of times I have out of the blue messaged someone on OKC that I didn't already know, and still have fingers left over to play the piano - so that generally isn't my use case :)

[3] interesting idea - a second order set of questions that act as a backup/validator for the odds of honesty in the answers to the first order set of questions?

focalintent commented 11 years ago

(Apologies for the brain dump - I'm in that cycle with my ADD meds where my fingers and hind brain take off running on me)

focalintent commented 11 years ago

Out of curiosity - is the tool centralizing users/answers? I can see that improving lookup times for someone browsing OKC, but I can also see that creating system load issues/costs :)

unquietpirate commented 11 years ago

Braindumps are good!

focalintent said:

So, in a way I think this is one direction OKC was starting to go down when indicating how important a question was to someone. Unfortunately, I don't have a rapidly handy list of all the questions that were absolutely important/vital to me [...]

Yeah, it's been my sense about PAT-OKC from the beginning that, in some ways, both the tool's current and projected functionalities are essentially just taking things that OKC claims to do with their algorithm (e.g. help you avoid someone who has answered a question in a way that is a dealbreaker for you) and simply making that process more transparent and customizable. (Additional evidence for the argument that OKC could do this easily and chooses not to.)

In a general sense, this seems to be sliding towards providing a way to group questions and divide answers for those questions into a flag/don't flag camp[1].

Nods. Or, as Maymay pointed out above, possibly into "Flag - Negative", "Flag - Positive", and "Don't Flag" categories. I agree that keeping it at that level (as opposed to graduated levels) seems more useful to me.
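To make the three categories concrete, here is a rough sketch, assuming each subscribed list carries a `polarity` of `"negative"` or `"positive"`. The subscription shape, the `BOX_COLOR` mapping (borrowed from the red/green boxing idea earlier in this thread), and the function names are all hypothetical. "Don't Flag" is simply the case where no subscribed list matches.

```javascript
// Hypothetical subscription shape: each subscribed list carries a polarity.
const subscriptions = [
  { name: "Predator list", polarity: "negative", flags: { q10: ["Yes"] } },
  { name: "Feminists list", polarity: "positive", flags: { q20: ["Yes"] } },
];

// Assumed visual convention: negative lists box in red, positive in green.
const BOX_COLOR = { negative: "red", positive: "green" };

// Return the subscribed lists for which the profile gave a flagged answer.
function matchingLists(subs, answers) {
  return subs.filter((sub) =>
    Object.keys(sub.flags).some(
      (q) => q in answers && sub.flags[q].includes(answers[q])
    )
  );
}

// One highlight per matching list; an empty array means "Don't Flag".
function highlightsFor(subs, answers) {
  return matchingLists(subs, answers).map((sub) => ({
    list: sub.name,
    color: BOX_COLOR[sub.polarity],
  }));
}

console.log(highlightsFor(subscriptions, { q10: "Yes", q20: "Yes" }));
console.log(highlightsFor(subscriptions, { q10: "No" }));
```

Returning one highlight per list, rather than a single merged verdict, keeps a profile that matches both a negative and a positive list visible as both.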

The data miner in me is curious - what kinds of classification/question groupings would people put together?

I'd want people to be able to make their curated lists public for precisely this reason.

What do those groupings that they put together (and/or the other-people-curated groupings they select and whether for positive/negative flagging?) say about that person?

Yep. ;) In fact, Maymay and a collaborator did something similar with PAT-Fetlife. The data miner in you might find it interesting if you haven't seen it already: http://maybemaimed.com/2012/12/21/tracking-rape-cultures-social-license-to-operate-online/

How obvious would attempts be to game the system?[3]

Hm. I'm curious about this, too, but I think it goes in a different thread...

Out of curiosity - is the tool centralizing users/answers? I can see that improving lookup times for someone browsing OKC, but I can also see that creating system load issues/costs :)

It used to but, unfortunately, doesn't anymore for precisely that reason. :( (A little more detail here: http://unquietpirate.tumblr.com/post/48754333269/monogamous-cultural-norms-contribute-to-protecting)

focalintent commented 11 years ago

Yeah - I saw the FAADE data dump - ah, if only fetlife had a mechanism for allowing people to give more ordered/organized data about themselves that could be mined, like OKC's questions. (Fetishes, perhaps?)

Also - I appreciated the cultural norms post when I first saw it posted (enjoyed sometimes feels like the wrong word for some of these conversations), thank you for writing that!

Also - back to this particular issue, I don't know how you prefer to do such things for your projects - there's no explicit issue for multiple lists of red flag questions (or for user edit-ability/sharing of such lists). Is that assumed to be rolled up under here, or do you want separate issues for them? (My process-brain insists on getting to type every now and again)

fabacab commented 11 years ago

@focalintent, I don't have any idea how I'd "prefer" to do such things because I've never done such things before. :) I'm intending to organize the issues list for this project in whatever way makes sense to me, but you're also very welcome to add discrete bugs/enhancement/issue reports however you like. I would very much like it if you added whatever you felt needed or could use adding, though. If I feel what you've added is superfluous, I'll probably label them "duplicate" and filter them out. And of course, please fork if and when you have the cycles to work on this yourself. :) Thanks for asking, though.

focalintent commented 11 years ago

Ok - I added #9 for tracking allowing users to define custom question/answer sets and #10 for sharing of said sets. GitHub doesn't appear to allow for setting dependencies between issues, but #9 and #10 would most likely need to be in place for this (likewise, #9 would be needed for #10 to be sensical). Each of those could be useful as they're completed as stepping stones to this, though (which is why I was thinking about separate issues - in case this ends up being larger to implement than a single update/set of changes).

I started taking a look at the js code. Also - I'm curious what about the server side was the load problem? Data transfer, CPU usage, both? I've only skimmed App Engine. (My day job involves a fair bit of profiling/optimization - and I do that at pretty much all levels, from CPU usage, to database response times, to network usage (both bandwidth and latency), etc... - hard to turn that off).

fabacab commented 11 years ago

@focalintent, that all looks great, thank you! :)

I'm curious what about the server side was the load problem?

Cost was the problem. Google App Engine sets strict resource limits for "free" apps, which my instance at http://okcupid-pat.appspot.com is. So, once the client began getting any significant use, my GAE instance hit its free resource limits and Google forced an error. The only ways to avoid this are to either pay Google (which I can't afford), or run the PAT-OKC server code on a machine we control running off the GAE SDK in development mode (suboptimal solution). Or, of course, to port the server code to some other scalable system.

My day job involves a fair bit of profiling/optimization

Cool! A more pressing performance/optimization issue is referenced in #3. If you have the time and inclination, I'd be very grateful if you took a look at that.

focalintent commented 11 years ago

Ok - I was looking through the server code, and I noticed that it was set to not be threadsafe, which would effectively serialize all your requests from around the world - this could end up inflating some of your usage numbers, in some ways: potentially load, bandwidth, and number of outstanding connections. Anyway - I have some personal projects that demand Google App Engine support, so I may come back and revisit this once I've worked my way around that.

I'll look at #3, though in-browser JavaScript is some of my weakest foo. I'll reference that ticket for any questions I have regarding it.