stacks-archive / app-mining

For App Mining landing page development and App Mining operations.
https://app.co/mining
MIT License

Decrease Product Hunt score until removed, and move them to onboarding, not app reviewing #144

Closed pstan26 closed 4 years ago

pstan26 commented 4 years ago

What is the problem you are seeing? Please describe. As Walterion described, Product Hunt may be letting in junk accounts to upvote apps, creating noise in the App Mining system. Launching on Product Hunt is still a good thing to do generally.

How is this problem misaligned with goals of app mining? App mining should have less and less noise in the system so it can be relied upon.

What is the explicit recommendation you’re looking to propose? Reduce the weighting of Product Hunt by 5% of the total every month until it’s removed as an App Reviewer. Fill the remainder with the New Internet Labs score for the time being. Continue to require a Product Hunt launch as a condition for entering an app in App Mining.

Describe your long term considerations in proposing this change. Please include the ways you can predict this recommendation could go wrong and possible ways to mitigate. Presumably New Internet Labs will become more robust and new App Reviewers will be introduced to replace Product Hunt as a differentiated app reviewer.

hstove commented 4 years ago

So, 20 months until Product Hunt is removed?

I don't understand why we'd add this long tail of complexity if the end result is just to remove PH.

pstan26 commented 4 years ago

5 months until it’s removed. The reason is that it causes less volatility in scoring for those who earned a good Product Hunt score recently.
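The two readings of "5% every month" (20 months versus 5 months) can be checked with a small sketch. It assumes Product Hunt starts at 25% of the total score, which is what the 5-month figure implies but is not stated in the thread; the function name is hypothetical.

```python
from fractions import Fraction


def months_until_removed(start_weight, monthly_step):
    """Count the monthly reductions needed to bring a reviewer's weight to zero.

    Both arguments are fractions of the total score (e.g. 1/4 for 25%).
    """
    weight = Fraction(start_weight)
    step = Fraction(monthly_step)
    months = 0
    while weight > 0:
        weight = max(weight - step, Fraction(0))
        months += 1
    return months


# Assumption: PH currently carries 25% of the total score.
PH_START = Fraction(1, 4)

# Reading 1: drop 5 percentage points of the total each month.
points_of_total = months_until_removed(PH_START, Fraction(5, 100))

# Reading 2: drop 5% of PH's own starting weight each month.
share_of_ph = months_until_removed(PH_START, PH_START * Fraction(5, 100))
```

Under that assumption, the first reading matches the 5-month answer and the second matches the 20-month question. Exact fractions are used so the loop does not stop a month late because of floating-point rounding.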

Walterion01 commented 4 years ago

Thanks, Patrick, for the recommendation. I think it lets the abusing apps continue what they are doing and gives more time for new ones to repeat it, as happened this month. I recommend taking effective action now, as it is already late. I don't think checking the top apps would be a hard job: those who want to buy votes will target the top ranks, otherwise it would not be worth it.

cuevasm commented 4 years ago

This is an effective action; removing it any faster is unfair to new apps. Reducing the weight of PH until it's gone is the only logical way of removing it, in my opinion. It's not clear to me that PH abuse is so rampant or effective that it's necessary to rip it out immediately.

pstan26 commented 4 years ago

Could be reasonable to reduce the window to 2-4 months, just want to be mindful of what Mitchell just mentioned^.

Walterion01 commented 4 years ago

It seems I didn't explain adequately. I disagree with removing PH completely, as it helps to bring a base level of quality and attention to the Blockstack community. Many folks active on PH are product makers, and I think it is the best way to introduce Blockstack to them. I was suggesting taking effective action to improve the PH results, not removing it.

pstan26 commented 4 years ago

Right that’s why I suggested making PH required for joining App Mining (part of the onboarding process essentially), but less and less so as an app reviewer.

Walterion01 commented 4 years ago

A good launch will help an app find an audience, and serious devs will take it seriously no matter what. But there are many low-quality apps on PH too, so I think a binary requirement may not do enough to encourage a good launch.

pstan26 commented 4 years ago

Do you think people would trash their reputation? Possibly. Also maybe this is a place where Product Hunt can have some control over quality?

Walterion01 commented 4 years ago

I don't think PH mainly controls the base quality; (real) voters make the difference. Although PH will not always promote the best ones, most of the time it will end up with the most-wanted ones.

Although it may help to increase the app count, it will reduce the overall focus on app quality because there is no reward for that.

My purpose in pursuing these issues was not to binarize a good reviewer but to control the outcome, something like what @cuevasm is doing with the good new proposal for Awario. And removing it will let the gamers walk free for the past months and let them carry on into the next growing quarter.

muneebm commented 4 years ago

I agree with removing PH as an app reviewer. You can see the stated scoring criteria for PH in the screenshot below (from this blog post): image

  1. Only the 4th one above (Popularity and interest in the product) is being evaluated by a PH score now, as the score doesn't include PH team score anymore (removed due to issues described in #55). But for that, we already have Awario, and with the changes proposed by @cuevasm [Blended Awareness #135], I think Awario's score will be a more reliable measure of popularity and interest.
  2. By having two reviewers (PH and Awario) measuring the same dimension (popularity), the final ranking is getting more weighted towards popularity compared to other dimensions.
  3. We have been trying to improve the effectiveness of the PH score for a few months already without a lot of success (#106, #85, #134), the reasons being the proprietary nature of their algorithm and the gameability of PH upvotes.

ViniciusBP commented 4 years ago

I agree with removing Product Hunt, but it would be really important to find another reviewer.

Right now, with the Awario changes, Awario will be very close to a binary score. NIL is already a binary score; it's basically a requirement. So the ranking will all be decided by TryMyUI, which benefits simple apps. There will be a big incentive to create apps that are as simple as possible. I don't think Blockstack's goal with App Mining is to have a hundred Dropbox/Docs/Photos alternatives, but if App Mining uses only TryMyUI, that is what will be incentivized.
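The concern above can be illustrated with a toy ranking. This sketch assumes the final score is a plain average of reviewer scores, which is a simplification and not the actual App Mining formula; the app names and numbers are made up.

```python
def rank_apps(scores_by_app):
    """Return app names ordered by the mean of their reviewer scores, best first."""
    def mean(values):
        values = list(values)
        return sum(values) / len(values)

    return sorted(
        scores_by_app,
        key=lambda app: mean(scores_by_app[app].values()),
        reverse=True,
    )


# Hypothetical data: NIL and Awario give every app the same (binary-like) score.
apps = {
    "complex_app": {"nil": 1.0, "awario": 1.0, "trymyui": 0.70},
    "simple_app": {"nil": 1.0, "awario": 1.0, "trymyui": 0.88},
    "medium_app": {"nil": 1.0, "awario": 1.0, "trymyui": 0.80},
}

# Ranking with all three reviewers, and with TryMyUI alone.
overall = rank_apps(apps)
trymyui_only = rank_apps({app: {"trymyui": s["trymyui"]} for app, s in apps.items()})
```

When two reviewers score every app identically, they shift all totals by the same amount, so the order is decided by the one reviewer that still differentiates. Whether Awario actually becomes that uniform is disputed later in the thread.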

Walterion01 commented 4 years ago

I made a report including Top20 apps and more info. Hope it leads to a useful result for the community. https://github.com/blockstack/app-mining/issues/134#issuecomment-525264113

wilsonbright commented 4 years ago

I agree with removing Product Hunt as a reviewer. I built blocksurvey.org over 20 days for the Can't Be Evil 1 Hackathon and launched on PH without knowing how it works, as I was new to it. I'm sure this is going to be the case for many new app developers like me coming on board with Blockstack in the future. Having Product Hunt as an app reviewer for mining rewards forces developers to worry about how to get more upvotes on PH rather than focusing on the product being built. Removing it will enable creators to invest their time in building a good product.

dantrevino commented 4 years ago

My thoughts on PH are well known (#73), I hope. Sooner is better.

Walterion01 commented 4 years ago

My thoughts on PH are well known (#73), I hope. Sooner is better.

I still think keeping PH and going with something like PH Analyzer (as an audit layer), with PBC or the community in control of the algorithm, is more beneficial for the overall quality of apps.

cuevasm commented 4 years ago

I think if you have to edit the actual data from an App Reviewer or filter it through a mechanism not designed or even endorsed by them (remember, they said the analyzer was incomplete), they cease to be an App Reviewer in the sense we should all want, which is independent. Thinking ahead to full decentralization, I don't think a community or PBC controlled audit layer would be a good move at all.

It's the same reason I have been pushing so hard not to be auditing Mentions from Awario anymore and why we're moving to a new model to address it. Interpreting existing App Reviewer data a new way that is more fair and less gameable = good. Fundamentally transforming the input data itself before applying it = bad (in the context of decentralization and overall fairness in the long-term).

I'd rather they be removed entirely than for us to start compromising on the definition of an independent App Reviewer.

Walterion01 commented 4 years ago

I see what you are picturing, and I agree that it is a more complicated way. Launching an app is not an easy task, but it keeps the system healthy to have a reviewer who controls the quality of and need for an app, and right now that is PH. Transforming data is not very welcome, but filtering? That seems to me a logical way to promote high-quality apps. I think we already do this by filtering out web reach from Awario, and that was an excellent choice.

Anyway, I agree with @dantrevino that sooner is better as we probably see more gamers this month too.

The last question would be that removing PH (or making it binary) seems the easiest way, but is it the best way too?

pstan26 commented 4 years ago

(1) I spoke to the Product Hunt team and relayed the community feedback above. Since the next round of App Mining is in 12 days, I think it makes sense to deprecate them after this next round of App Mining. This would be like a dry run month, given we historically have done that. It wouldn't be asap obviously but at least the community wouldn't have abrupt changes 10 days out. Seems like a fair move and it is exciting to remove the noise from the system here.

(2) One other question is do we want to simply have them as a requirement for entering App Mining? They could be like a quality bar and have some discretion to not include apps that are half-developed etc.

cuevasm commented 4 years ago

I like having it as a requirement sometime in the first three months so Miners can select the right time for them. It's a good forcing function to get ready for real eyeballs and people, plus it's great that the Blockstack network is constantly appearing there, this brings everyone new potential users.

It's currently the only App Reviewer that forces us out of our little bubble to consider how we'll bring these apps to a broader market and to get feedback from normal folks that don't do crypto all day (these are the people we need to join us!)

friedger commented 4 years ago

We can do the dry run this month and calculate the score for this round without PH. Doing the calculations and seeing the effect should be enough.

Furthermore, 10 days warning should be enough. What kind of change of behavior do you expect from this change?

Question 2) could be discussed in a new issue.
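A dry run of this kind boils down to ranking the same round twice and comparing positions. The sketch below is a minimal illustration with made-up scores; the function names and data are hypothetical and it does not reproduce the actual App Mining scoring.

```python
def rank_positions(scores):
    """Map each app to its 1-based rank, highest score first."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {app: position for position, app in enumerate(ordered, start=1)}


def rank_changes(with_ph, without_ph):
    """Return {app: places gained} when PH is dropped (negative means places lost)."""
    before = rank_positions(with_ph)
    after = rank_positions(without_ph)
    return {app: before[app] - after[app] for app in before}


# Hypothetical final scores for the same round, with and without the PH component.
with_ph = {"app_a": 0.90, "app_b": 0.75, "app_c": 0.60}
without_ph = {"app_a": 0.55, "app_b": 0.80, "app_c": 0.70}

changes = rank_changes(with_ph, without_ph)
```

Sorting `changes` by value gives the kind of "top winners" and "top losers" lists a dry run is meant to surface.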

cuevasm commented 4 years ago

Haha @friedger you're always ready for the quick change, most folks have repeatedly expressed that they need a little more time. :)

wilsonbright commented 4 years ago

@pstan26 @cuevasm If we delay it, wouldn't we still be rewarding the apps that have gamed PH? I remember Awario calling out a few apps that took advantage of the system a few months back and removing mentions. Should a similar approach be followed for PH for the top 50 apps this month? What are your thoughts on apps that started at 100 votes on launch and today stand at 1000+ votes? I feel it will affect other app teams.

cuevasm commented 4 years ago

I think it's a lot less provable that Product Hunt has been gamed at any notable scale. Yes, the Analyzer that was built is compelling, but PH themselves pointed out flaws in it and I don't think it can safely be relied on to say 'yes, certainly these were gamed'. Whereas, with Awario, it's pretty plain and simple to see clearly what was happening and have those removed. I think we should avoid trying to edit PH ourselves without complete knowledge of their systems and just ride out the final time window. Also, I'm just less inclined to make a change to how we are counting PH in the last month - it's probably a lot of wasted effort since we're not planning to try and keep them on in that capacity.

ViniciusBP commented 4 years ago

There are several changes being made at the same time in app mining and this should totally change the dynamics of App Mining, so it would be great to understand the effect of all changes together.

Looking at all proposals, the next scenario according to my understanding is:

1) TryMyUI will be bi-monthly
2) Awario will be close to a binary score
3) PH removal

In this case, how would the ranking of new apps work? Would they only be evaluated by TryMyUI? From previous data, it is clear that the luck factor is decisive on TryMyUI: the same app can average 70 or 88, depending on the quality of the testers.

Considering this scenario, I predict that the result will become random, but with only bi-monthly changes. I think it is very important to look for a new non-binary Reviewer, so there is some incentive besides improving usability and getting lucky. Is Blockstack planning any new reviewer soon?

It would be interesting to run DryRun for more than a month, to also see the effect of consistency on TryMyUI results, as this app reviewer will be pretty much the only one differentiating most apps.

cuevasm commented 4 years ago

Awario isn't becoming a binary reviewer. It's just emphasizing/incentivizing a different part of one's online presence by forcing you to differentiate in a particular area vs. gaming social media. Blended Awareness: https://github.com/blockstack/app-mining/issues/135#issuecomment-523136444

friedger commented 4 years ago

Here is the dry run for August: https://docs.google.com/spreadsheets/d/1F6FMevv0BapSSNqKqRJPXmcRJj_-gT0pZqb-gqqnlnA/edit#gid=97800140

Screenshot from 2019-09-19 08-53-22

and an analysis of the impact: https://docs.google.com/spreadsheets/d/1c5M27Gbz9ZwE5GE1HydD6OUV0__Qt6m2yvKtkAa36hk/edit#gid=0

Top winners: Screenshot from 2019-09-19 08-51-15

Top losers: Screenshot from 2019-09-19 08-51-44

Is that a fairer, better result?

Walterion01 commented 4 years ago

@friedger If it is correct that the simplest apps get the best TMUI results, then it is reasonable to propose making calculator apps, as they are far simpler and therefore getting a good TMUI result will be much easier, and for Awario we can add some fun skin to get social network attention.

friedger commented 4 years ago

@Walterion1 How do you read this from the dry run results?

Walterion01 commented 4 years ago

I cannot read the full results, as the links seem to be private shares that require signing in. Judging by your screenshots, it seems the simplest apps in the mining got the best results.

friedger commented 4 years ago

@Walterion1 The sheets have been shared publicly now: https://docs.google.com/spreadsheets/d/1F6FMevv0BapSSNqKqRJPXmcRJj_-gT0pZqb-gqqnlnA/edit#gid=97800140?usp=sharing

https://docs.google.com/spreadsheets/d/1c5M27Gbz9ZwE5GE1HydD6OUV0__Qt6m2yvKtkAa36hk/edit?usp=sharing

To me, it looks like Awario and TryMyUI make a good mix.

Walterion01 commented 4 years ago

@friedger Thank you. Looking at the top 5 shows that simpler ones have a much better chance.

friedger commented 4 years ago

@Walterion1 The TryMyUI score of the top 5 apps (dry run) ranges from 0.6734349943 to 1.155509493. That doesn't look like TryMyUI is the only factor, and it does not convince me that the algorithm prefers simple apps (under the assumption that TryMyUI prefers simple apps).

friedger commented 4 years ago

Here is the dry run for September using the final score of August Dry Run as "score last round", i.e. there is no impact of a PH score for the last two months.

Top 20: Screenshot from 2019-09-21 11-12-47

https://docs.google.com/spreadsheets/d/1F6FMevv0BapSSNqKqRJPXmcRJj_-gT0pZqb-gqqnlnA/edit#gid=853566369?usp=sharing

Top Winners: Screenshot from 2019-09-21 11-02-34

Top Losers: Screenshot from 2019-09-21 11-03-31

friedger commented 4 years ago

With two months of dry run data we should be able to make a decision on this issue NOW :-)

friedger commented 4 years ago

Please add label dryrun (completed) (see #157)

dantrevino commented 4 years ago

@Walterion1 simple apps winning has always been the case. Changing PH will not change that fact.

dantrevino commented 4 years ago

I think it's a lot less provable that Product Hunt has been gamed at any notable scale. Yes, the Analyzer that was built is compelling, but PH themselves pointed out flaws in it and I don't think it can safely be relied on to say 'yes, certainly these were gamed'.

You're probably right, but my problem with that is that we're relying on some secret sauce the PH has decided is better. They might be right. Might not.

Walterion01 commented 4 years ago

@Walterion1 simple apps winning has always been the case. Changing PH will not change that fact.

It could prevent that a little more.

@cuevasm I think with this month's results you can see it a little better, just like with Awario. And if we try to improve the situation rather than remove it completely, it may be worth the try. Consider that without a reviewer that controls the quality of and need for an app, simple apps can submit, get a high TMUI score, and get high results. They do not need to improve, and this will probably weaken the system against spam apps, as you can see this month too (because PH has no control over it). Sure, they will increase the app count, but not the quality and real awareness Blockstack needs to grow.

muneebm commented 4 years ago

Consider that without a reviewer that controls the quality of and need for an app, simple apps can submit, get a high TMUI score, and get high results. They do not need to improve, and this will probably weaken the system against spam apps, as you can see this month too (because PH has no control over it).

I think PH doesn't control the quality of the apps; apps with just a landing page and no real functionality sometimes get more upvotes than others. As @dantrevino mentioned, even with PH, simple apps getting the highest ranks has always been the case. I think keeping PH as an app reviewer will not help with any of the things that are mentioned here. I agree that we need to find another reviewer to replace PH, but continuing to include the PH score is not going to improve the situation in any way.

njordhov commented 4 years ago

Consider that without a reviewer that controls the quality of and need for an app, simple apps can submit, get a high TMUI score, and get high results. They do not need to improve, and this will probably weaken the system against spam apps, as you can see this month too (because PH has no control over it).

I think PH doesn't control the quality of the apps; apps with just a landing page and no real functionality sometimes get more upvotes than others.

Many on Product Hunt vote without even checking out the app, as is evident from comparing upvotes with registered users on Blockstats. Product Hunt upvotes favor appealing claims whether or not the product actually delivers.

Proposal #155 addresses this issue by suggesting adding a reviewer for truth in marketing, which will hold developers to what is promised and proclaimed on Product Hunt and elsewhere.

GinaAbrams commented 4 years ago

Product Hunt scores will count for the October rankings. We're going to proceed with a dry run removing Product Hunt at the same time and share the results.

friedger commented 4 years ago

Two months dry run data is not enough?

Gina Abrams notifications@github.com wrote on Tue, Oct 1, 2019, 18:22:

Product Hunt scores will count for the October rankings. We're going to proceed with a dry run removing Product Hunt at the same time and share the results.


Walterion01 commented 4 years ago

I updated issue #134.

wilsonbright commented 4 years ago

For the betterment of the community, a quicker decision would help. We have two months of dry run data. Can we re-consider the PH process for October's App mining?

dantrevino commented 4 years ago

+1 to @wilsonbright's last comment. @pstan26 @GinaAbrams @cuevasm it was 6 months ago that I proposed removing PH (#73). At the time, we wanted to take a more considered approach and see if we could improve the situation. It's become apparent over the last 6 months that this issue scales with the growth of the developer community.

6 months is more than long enough to have tried to make changes to PH work.

I urge the immediate suspension of PH as an App Reviewer. Effective October 1.

pstan26 commented 4 years ago

Hey all, as noted in the updates this month, we're doing a dry-run of scoring without Product Hunt. This is a necessary step to take before deprecation could take place. As we've been working through all the feedback here, one of the steps we took was to engage the Product Hunt team to understand what they had to say about the credible upvotes and so forth. In the course of those discussions, we've learned that they're not able to reveal any more to us in terms of the algorithm or provide further filtering/analysis on the data. Given the lack of additional data and that we can't necessarily rely on analysis of our own given our insight on the data is inherently limited, we think it's best to move on from Product Hunt as a regular App Reviewer. We are exploring some ways we could make them a part of launching apps, as there is value in the visibility and the forcing function of exposing Blockstack apps to this community sooner than later.

Additionally, they let us know that the scoring load for them is approaching maximum capacity as it is. They are interested in figuring out ways to work with everyone such that it is a good fit for them; it seems the majority find the current model of their reviewership too gameable to be acceptable.

All these things in mind, again, we're taking the steps needed to phase Product Hunt out.

Walterion01 commented 4 years ago

@pstan26 I will be glad to see improvement in this matter, thank you.

I think you are missing an important point here: there are obviously fake votes (like the false tweets before), and there is no need for the PHA to recognize them (check the #134 samples). These values will skew the results profoundly (as happened before), so an app like Dclouds can quickly get a very high score (1.2) this month. Even if we remove PH next month, it will use the previous month's score to keep corrupting the results, and it will take months to spend that high score (because of averaging with the last round). They can also recharge it with sponsored news paid for by the budget provided through these fake upvotes.
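The carry-over effect described above can be sketched numerically. The thread only says the score is averaged with the last round; the equal 50/50 blend, the function name, and the zero scores in later months are assumptions for illustration. Only the 1.2 starting score comes from the comment.

```python
def carried_scores(monthly_scores, carry=0.5):
    """Blend each month's raw score with the previous final score.

    final = (1 - carry) * raw + carry * previous_final
    The carry weight is an assumption; the thread does not state it.
    """
    finals = []
    previous = None
    for raw in monthly_scores:
        # The first month has no earlier round to blend with.
        final = raw if previous is None else (1 - carry) * raw + carry * previous
        finals.append(final)
        previous = final
    return finals


# One inflated month (1.2, as cited above) followed by three months scoring 0.
finals = carried_scores([1.2, 0.0, 0.0, 0.0])
```

Under these assumptions the inflated month halves each round instead of vanishing, so an eighth of it is still counted three rounds later. A smaller `carry` would shorten that tail.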

We are maybe going in the right direction here, but I think you are totally missing the meantime and the considerable impact it has on us along the way.

friedger commented 4 years ago

we're doing a dry-run of scoring without Product Hunt. This is a necessary step to take before deprecation could take place.

@pstan26 Dry run data for the last two month is already there! Hence, deprecation can take place NOW.

x5engine commented 4 years ago

Yesssss amazing!