stacks-archive / app-mining

For App Mining landing page development and App Mining operations.
https://app.co/mining
MIT License

Remove medium reach from blended awareness score (Awario) #171

Closed friedger closed 4 years ago

friedger commented 4 years ago

What is the problem you are seeing? Please describe. Medium posts receive a reach score that does not (currently) reflect meaningful reach.

How is this problem misaligned with goals of app mining? The reach of a Medium post is valued much higher than that of a post on an established site.

What is the explicit recommendation you’re looking to propose? Remove the Medium score from the Awario reach score (until the algorithms have been improved).

Describe your long term considerations in proposing this change. Please include the ways you can predict this recommendation could go wrong and possible ways to mitigate.

This is a short-term solution. Awario is working on a better evaluation of Medium post reach; when that is finished, this should be reviewed.

Additional context Discussion during the app mining call on 30 Oct

cuevasm commented 4 years ago

Haha you beat me to it, I personally support this. Awario is working on a new way to track Medium Reach such that it's more accurate like the others. Until then, I think we should just avoid the noise in the data and not create a situation where people are incentivized to try to game Medium articles into the News/Blog section. I think this is something we could do immediately vs. waiting a month. Easy to run the results with these excluded.

ViniciusBP commented 4 years ago

100% support for this.

friedger commented 4 years ago

Please do it immediately!

cuevasm commented 4 years ago

Not up to me :)

cuevasm commented 4 years ago

Pavel just let me know they have been testing a new algo and it's way better for Medium and Product Hunt too, says it'll probably be out this week. We can look at the results this month and see what we think, easy to run the results both ways.

Walterion01 commented 4 years ago

It is a good proposal, but I think it should cover more sources and not just Medium. Otherwise we will need to open an issue for each source. There are many sources for which Awario cannot get a reach value, and for some it reports millions, even more than in the Medium case. So I think it is better to apply the same treatment to suspiciously high values, whether from Medium or other sources. @friedger you may want to update the proposal to cover other sources too.

sdsantos commented 4 years ago

So far I've only discovered wild results with Medium, but I imagine it can happen with other blogging platforms. Maybe we should take a quick look at any source with a reach bigger than 50K, for example, and evaluate whether it makes sense.
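
The review suggested in this comment could be sketched roughly as follows. This is only an illustration, not Awario's API or the actual App Mining pipeline: the `Mention` record, its fields, and the sample values are hypothetical, and only the 50K threshold comes from the comment itself.

```python
from dataclasses import dataclass

# The "bigger than 50K" example threshold from the comment above.
REVIEW_THRESHOLD = 50_000


@dataclass
class Mention:
    """Hypothetical record for one mention; not an Awario data type."""
    app: str
    url: str
    reach: int


def flag_for_review(mentions, threshold=REVIEW_THRESHOLD):
    """Return mentions whose reported reach exceeds the threshold, largest first."""
    flagged = [m for m in mentions if m.reach > threshold]
    return sorted(flagged, key=lambda m: m.reach, reverse=True)


# Made-up sample data, purely for illustration.
sample = [
    Mention("app-a", "https://medium.com/@someone/post", 2_000_000),
    Mention("app-b", "https://news.example.com/story", 12_000),
    Mention("app-c", "https://blog.example.org/entry", 75_000),
]

for m in flag_for_review(sample):
    print(f"{m.app}: {m.url} (reach {m.reach})")
```

Flagging only produces a list for a human to look at; whether a flagged value "makes sense" remains a judgment call for the reviewers.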

dantrevino commented 4 years ago

I don't agree that we should not wait a month. If we know something is broken, we should be waiting for it to be provably reliable instead of just throwing garbage into the ratings.

cuevasm commented 4 years ago

Dan either I'm confused by your wording or you're confused about the timing. Everyone here is supporting removing Medium right away (aka not putting garbage in the ratings). 'I don't not agree', meaning you do want to wait a month?

What I am saying is that we can simply see the results this month and then decide to exclude Medium because it's 'broken'. The only waiting would be to see how Awario's update helps the situation and we can always add Medium back in.

dantrevino commented 4 years ago

@cuevasm I was actually confused by your suggestion that "We can look at the results this month and see what we think..." ... I thought you were suggesting to not do anything. Ignore my comment.

cuevasm commented 4 years ago

Got it, I see that! No I would personally support removing Medium this month due to the obvious noise and look forward to an imminent update from Awario on it for next month.

Walterion01 commented 4 years ago

@cuevasm will you do it for all the obvious noises or just Medium?

friedger commented 4 years ago

This issue is about medium. Let's discuss each source of noise individually as Awario needs to adapt their algorithms.

Walterion notifications@github.com wrote on Mon, Nov 4, 2019, 21:25:

@cuevasm https://github.com/cuevasm will you do it for all the obvious noises or just Medium?


Walterion01 commented 4 years ago

One could ask why we are changing data coming from a reviewer, since we did not like doing so the previous time and did not want it for the PH case at all. I think this is a reasonable move, but it would be good to have clearer guidance on when it is OK to change data coming from a reviewer, for future suggestions.

stackatron commented 4 years ago

@cuevasm is this all settled? Is this an audit period change or no? Moving to review.

cuevasm commented 4 years ago

@Walterion1 In my opinion, not including some data is not the same as transforming existing data, as the Analyzer would be doing, which is where I know your comment stems from.

Moving forward with no audit period - will revisit data after Awario pushes changes and add back if their updates make it reliable.

In short, Medium links will NOT be included in this month's score. cc @GinaAbrams for changelog
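
As a rough sketch of what "not including" Medium links could look like: the thread does not describe how the blended awareness score is actually computed, so the plain sum below is a placeholder, and `EXCLUDED_DOMAINS`, `is_excluded`, `total_reach`, and the sample data are all hypothetical names and values introduced for illustration.

```python
from urllib.parse import urlparse

# Hypothetical exclusion list; the thread only decides on Medium.
EXCLUDED_DOMAINS = {"medium.com"}


def is_excluded(url, excluded=EXCLUDED_DOMAINS):
    """True if the URL's host is an excluded domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in excluded)


def total_reach(mentions, excluded=EXCLUDED_DOMAINS):
    """Sum reach over (url, reach) pairs, skipping excluded domains.

    A plain sum stands in for the real scoring formula, which is an assumption.
    """
    return sum(reach for url, reach in mentions if not is_excluded(url, excluded))


# Made-up sample data, purely for illustration.
sample = [
    ("https://medium.com/@someone/my-app-review", 2_000_000),
    ("https://blog.medium.com/some-post", 500_000),
    ("https://news.example.com/story", 12_000),
    ("https://notmedium.com/post", 3_000),
]

print(total_reach(sample))                  # Medium links left out
print(total_reach(sample, excluded=set()))  # everything included
```

Running the totals both ways mirrors the "easy to run the results both ways" remark earlier in the thread. Matching on the hostname would miss Medium publications served from custom domains, so a real filter would need more than this.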

Walterion01 commented 4 years ago

@cuevasm changing reviewer results is a change anyway. It may be right or wrong, but the view I thought PBC held is to give the reviewer full control and not change its results after the report, as that is the reviewer's job. For example, you decided to remove Medium; some developers, like us, had Medium posts, and with those removed our reach will be the same as that of an app that didn't bother to write anything. One reason this matters is that Awario still cannot get results for some sources and shows N/A. A reviewer might think it through and, instead of removing such a post, assign it 100. That way there would still be a difference and no problem with noise. PS: the Analyzer was removing noise too; it was just more complicated than removing a source of awareness. In my opinion these are the same; only the size differs, which is reasonable because the effect upvotes had was significant compared to some blog posts.

friedger commented 4 years ago

The blended awareness score is not about capturing posts written by the app publisher but posts by news sites. Therefore writing articles on Medium, your own blog, etc. should not affect the awareness score.

@Walterion if you know news sites that are not captured by Awario I suggest opening an issue for that

Walterion notifications@github.com wrote on Sat, Nov 9, 2019, 07:01:

@cuevasm https://github.com/cuevasm changing reviewer results is a change anyway. It may be right or wrong, but the view I thought PBC held is to give the reviewer full control and not change its results after the report, as that is the reviewer's job. For example, you decided to remove Medium; some developers, like us, had Medium posts, and with those removed our reach will be the same as that of an app that didn't bother to write anything. One reason this matters is that Awario still cannot get results for some sources and shows N/A. A reviewer might think it through and, instead of removing such a post, assign it 100. That way there would still be a difference and no problem with noise.


Walterion01 commented 4 years ago

The blended awareness score is not about capturing posts written by the app publisher but posts by news sites. Therefore writing articles on Medium, your own blog, etc. should not affect the awareness score.

@cuevasm is that the case?

@walterion if you know news sites that are not captured by Awario I suggest opening an issue for that

I reported some of them before but I think the Awario team knows that already.

qqnoname commented 4 years ago

Therefore writing articles on Medium, your own blog, etc. should not affect the awareness score.

But some Medium articles were published by bloggers and media that publish only on Medium. Why not just use claps + followers as a reach score for Medium articles?

friedger commented 4 years ago

Indeed that is what Awario wants to implement as far as I understood.

qqnoname notifications@github.com wrote on Sat, Nov 9, 2019, 12:51:

Therefore writing articles on Medium, your own blog, etc. should not affect the awareness score.

But some Medium articles were published by bloggers and media that publish only on Medium. Why not just use claps + followers as a reach score for Medium articles?


qqnoname commented 4 years ago

But in this case why not manually update this data for the current results? I guess that there are not many Medium posts.

cuevasm commented 4 years ago

That's the plan @qqnoname. Next, counting claps on Medium articles is not feasible as it would be manual, and it's way too easy to game.

Walterion/Friedger: As for Medium and news/blogs, articles you write on your own Medium wouldn't count because Awario won't recognize them as a news outlet and they wouldn't be in the News/Blog category. Only more established places that publish content there regularly and hit all the right markers for a news/blog would be counted by the AI as 'News/Blog'. Otherwise it just goes under 'Web' and doesn't count anyway.

sdsantos commented 4 years ago

It seems meetup.com mentions are also suffering from the same inflated reach. One could argue that it's not news/blog, but the reach Awario is giving it is also way bigger than the particular meetup group's member count.

cuevasm commented 4 years ago

I would support removing Meetup.com too, it's not really a publication, but that's my opinion - anyone feel free to start a thread on it. Awario gets better all the time, but as we discover sources that have off Reach results where it's included in News/Blog we should be flexible about accounting for it to prevent any abuse.

Claps as a reach score is too easy to game, that's not what they are implementing from my understanding.

The idea here is not to self publish reach of course - this should be pretty hard, I would be surprised if someone managed to do this and got it to count in the News/Blog area.

And as far as a decision on this, everyone had an opportunity to chime in here and no one opposed, so we proceeded. It's a pretty obvious case of something being wrong with the data vs. a fundamental scoring change.

cuevasm commented 4 years ago

I'm going to close this one and open a new ticket for Meetup.com. I think we should just keep an eye out for websites like Meetup and Medium that are clearly not News/Blog and assess each one and move rapidly to remove. Anyone should feel free to start tickets on sites that seem weird in terms of Reach and go into the News/Blog category. Awario is pretty fast at making these things better!

New ticket: https://github.com/blockstack/app-mining/issues/189