TeamNewPipe / NewPipe

A libre lightweight streaming front-end for Android.
https://newpipe.net
GNU General Public License v3.0

Add integration with returnyoutubedislike.com #7469

Closed m00nwtchr closed 2 years ago

m00nwtchr commented 3 years ago

What feature do you want?

Add integration with https://returnyoutubedislike.com/

Why do you want this feature?

To return YouTube dislikes

I may be able to implement this myself but I'll have to take a look at their API and get familiarised with NewPipe's codebase

triallax commented 3 years ago

We already fixed that, see https://github.com/TeamNewPipe/NewPipeExtractor/pull/753.

m00nwtchr commented 3 years ago

No, the point is to still show a dislike count, not just fix parsing (and their method of doing that is much better than just calculating an estimate, especially with the planned features)

triallax commented 3 years ago

the point is to still show a dislike count

I don't get it, the PR does that too.

just calculating an estimate

It's actually pretty accurate.

especially with the planned features

Yeah, that's a good point.

Anyway, I was not dismissing your suggestion, just saying that the bug has been fixed for now.

m00nwtchr commented 3 years ago

ah, yeah i only realised that that PR also added an estimated amount after i wrote that (hence the edits)

eternal-sorrow commented 2 years ago

If such an integration is added, it should be disabled by default.

Anarios commented 2 years ago

Hi, I'm the owner of returnyoutubedislike.com. If there are plans for integration - I'd be happy to assist. Our database has over 200mil videos saved with stats before the dislikes were disabled, and a huge dataset (69tb) from Archive Team is coming as well.

Other than that we're collecting likes and dislikes from 1.5mil extension users - and using those to estimate actual dislike numbers (not perfect - but the more users we get - the closer it becomes to true numbers)

pgamerx commented 2 years ago

(69tb)

Nice

Zoobdude commented 2 years ago

Hi, I'm the owner of returnyoutubedislike.com. If there are plans for integration - I'd be happy to assist. Our database has over 200mil videos saved with stats before the dislikes were disabled, and a huge dataset (69tb) from Archive Team is coming as well.

Other than that we're collecting likes and dislikes from 1.5mil extension users - and using those to estimate actual dislike numbers (not perfect - but the more users we get - the closer it becomes to true numbers)

And surely another integration would make them more accurate.

Anarios commented 2 years ago

And surely another integration would make them more accurate.

yes, that's the point of integrations :)

mcmurphy8097234789 commented 2 years ago

If such an integration is added, it should be disabled by default.

I am inclined to agree. No shade being thrown at @Anarios (thank you for your work), but currently the extension does rely on user interaction/data to generate accurate counts. Under the current security FAQ offered by the project, it is not clear for how long the hashed IP addresses or the user IDs are stored. Finally, the server backend code has not been open-sourced yet. Again, I appreciate the work Anarios has done (and he does have the tentative support by names such as LTT), and I think this type of solution is the fix needed for the dislike problem. However, until the concerns I mentioned are rectified, I would absolutely support the extension being disabled by default should it be included in NewPipe.

On that note, is the NewPipe team considering implementing it at this point? Or no?

Anarios commented 2 years ago

it is not clear for how long the hashed IP addresses or the user IDs are stored

Well, indefinitely. Once you cast your vote - the vote is stored forever.

However, until the concerns I mentioned are rectified

I honestly do not understand why you (and others) think that publishing backend code improves privacy concerns by any means. The code running on server could be different from the published code, there could be a reverse proxy between server and client doing any kind of things with data passed.

I think, by default, you should assume that none of the data sent to server is safe (and therefore we send as little data as possible).

I would absolutely support the extension being disabled by default should it be included in NewPipe.

As an in-between option - viewing dislikes can be enabled by default (since no userId is generated for this), and counting dislikes and likes could be an opt-in option. But the user has to see and be warned that his votes are not counted unless he enables it (otherwise no one will even know that his votes do not affect anything).

Midou36O commented 2 years ago

I honestly do not understand why you (and others) think that publishing backend code improves privacy concerns by any means. The code running on server could be different from the published code, there could be a reverse proxy between server and client doing any kind of things with data passed.

It does help with privacy concerns: if someone doesn't trust your server, they could run their own instance that they manage themselves, so they only have to trust themselves and not some other party.

As an in-between option - viewing dislikes can be enabled by default (since no userId is generated for this), and counting dislikes and likes could be an opt-in option.

Enabling the ability to view dislikes by default sends the user's public IP to your server without their consent (not that it's such a big deal, but some people prefer to have the freedom to choose if they want to use this or not, thus the option of making it off by default being more logical)

Anarios commented 2 years ago

they could run their own instance of that server

And get data where? even if I share dumps - that data is outdated quickly. The whole idea behind this extension is that it should be centralised, and that all votes should be collected in one place, eliminating fragmentation of userbases.

That's why I'm sharing the API for free.

Midou36O commented 2 years ago

And get data where? even if I share dumps - that data is outdated quickly. The whole idea behind this extension is that it should be centralised, and that all votes should be collected in one place, eliminating fragmentation of userbases.

And if something happens to that server? (Say for example that an outage happens) No one would be able to pull youtube dislikes.

Anarios commented 2 years ago

And get data where? even if I share dumps - that data is outdated quickly. The whole idea behind this extension is that it should be centralised, and that all votes should be collected in one place, eliminating fragmentation of userbases.

And if something happens to that server? (Say for example that an outage happens) No one would be able to pull youtube dislikes.

that's why there are DB backups and backup servers.

FireMasterK commented 2 years ago

I honestly do not understand why you (and others) think that publishing backend code improves privacy concerns by any means. The code running on server could be different from the published code, there could be a reverse proxy between server and client doing any kind of things with data passed.

I honestly don't think the backend code matters as much, but what matters more is the database. For example, in SponsorBlock, we can have an "aggregator" server, which queries the database for lookups, and sends like/dislike contributions upstream to the main server. A project which does this - https://github.com/mchangrh/sb-mirror

And get data where? even if I share dumps - that data is outdated quickly.

This is a problem to solve, SponsorBlock has real-time dumps. You could work something out, I'm sure it's possible.

Anarios commented 2 years ago

SponsorBlock has real-time dumps.

Well, I do not want to share real-time dumps, to not facilitate copycats using them and create userbase fragmentation.

FireMasterK commented 2 years ago

Well, I do not want to share real-time dumps, to not facilitate copycats using them and create userbase fragmentation.

I think Ajay handled that well, by licensing it under the CC BY-NC-SA 4.0.

pgamerx commented 2 years ago

SponsorBlock has real-time dumps.

Well, I do not want to share real-time dumps, to not facilitate copycats using them and create userbase fragmentation.

You don't need to facilitate copycats, they will always exist no matter what you do and that's why this cannot be an adequate response/excuse for not sharing real-time dumps.

Copycats can easily build their own API and route requests to yours through multiple freely available proxies. That also keeps the IP-based rate limiting from triggering, letting them keep making requests for as long as they want.

I honestly do not understand why you (and others) think that publishing backend code improves privacy concerns by any means. The code running on the server could be different from the published code, there could be a reverse proxy between server and client doing any kind of things with data passed.

I think multiple resources explain the same, I am going to quote some of them here if you don't mind.

Start

"The philosophy of open-source is that everyone can contribute to making a better project. This means it’s more accessible and more reliable as, sometimes, you can have tens of experts working on one project, guaranteeing quality and security. And the fact that one person or team oversees everything means there won’t be discrepancies or clashes with the overall vision. That ensures that you have a great product without proprietary restrictions."

"There is a variety of good reasons to release something under an open-source license, from “more perspectives make better software” to “establishing a standard.” It is important to build a sustainable project to consider your reasons for publishing as open-source and use these as guidance for decision making." -Google

"The advantages of having source code open extends not just to software that is being attacked, but also extends to vulnerability assessment scanners. Vulnerability assessment scanners intentionally look for vulnerabilities in configured systems. A recent Network Computing evaluation found that the best scanner (which, among other things, found the most legitimate vulnerabilities) was Nessus, an open-source scanner [Forristal 2001]."

"Open source code is often higher quality. A piece of software created by a team of developers can be of lower quality than that developed by thousands of developers from all over the world with experience in different technologies, industries, and projects. And bugs in open source software are identified very quickly as the code is being constantly reviewed by multiple developers."

"You should use open source software for application development because it is more secure. The community promptly finds and reports security flaws which the software owner usually fixes right away."

End

Anyway, it's your software, and it's up to you whether to open-source it or not. What I believe, and what many others believe, is that open-sourcing will help everyone (you, consumers, fellow developers, future sponsors(?)).

Also, don't mind me asking but what is your reason for not wanting to open-source the backend?

TheFrenchGhosty commented 2 years ago

Okay, so I'll copy/rewrite here what I sent when someone suggested its integration in Invidious (note: I'm not trying to be nice, so my comments might feel "rough"):

There are multiple issues with returnyoutubedislike:

What I mean with the last point is that most of the developer's comments show that he likes/wants to show off that he's "the guy that did it":

Translates to: I am the only one who has updated data

Translates to: It's MY service, I don't owe you anything

(that's how I read it at least)

It's not the only place where you can read messages like that, even comments posted here are basically the same thing:

The reasons for that can vary: ego (so much hype around it), wanting donations, wanting sponsors (I mean, look at the random sketchy VPN sponsor (a VPN that literally logs IP addresses) that was added recently; it was even pointed out by LTT).

So in conclusion: the project isn't trustworthy enough to be implemented in any of the privacy-focused YouTube frontends (Invidious, NewPipe, FreeTube, Piped...)

Anarios commented 2 years ago

Translates to: It's MY service, I don't owe you anything

Well, yes. I agree on this point

look at the random sketchy VPN sponsor that was added recently

It has existed for 10 years, has over a mil installs on Google Play and a score of 4.8, so I thought it was OK to take 100$ from them. If I saw anything sketchy - they would be deleted.

Its backend is closed source (this on its own is a dealbreaker)

I'm still waiting for one good explanation of how opening the backend code would improve data privacy for users. The server is a black box anyway - regardless of whether it's open source or closed source.

The developer seems to feel entitled, as if "deserving" to be the only one having an extension that does that

Not sure what you mean here. I think that my work belongs to me - that's all.

It's up to users and services to choose whether to use the API\extension, but I don't feel like I owe anyone anything, other than being honest about what's done with their data and what's being stored and how their data is being used.

I don't feel like I owe sharing the realtime DB or sharing backend code right now, if it causes me more problems than benefits.

Anarios commented 2 years ago

@TheFrenchGhosty, don't get me wrong - I see the reasons for your criticism, I just don't see painless options to resolve it.

TheFrenchGhosty commented 2 years ago

It has existed for 10 years, has over a mil installs on Google Play and a score of 4.8, so I thought it was OK to take 100$ from them. If I saw anything sketchy - they would be deleted.

It logs IPs and timestamps, and it's a shady no-name company. 1 million downloads is nothing on Google Play.

Also, 100$ to promote them: you got basically scammed at this price point.

I'm still waiting for one good explanation of how opening the backend code would improve data privacy for users. The server is a black box anyway - regardless of whether it's open source or closed source.

It's mainly about trust and transparency.

If your backend is closed source, it means you have something to hide.

It also allows others to host their own instance (something you seem to be against... because of your "IT'S MINE!" attitude).

Not sure what you mean here. I think that my work belongs to me - that's all.

Going FOSS doesn't change that. Going FOSS is about sharing your work with the world, it's not about protecting your work: 99% of licenses protect it.

It's up to users and services to choose whether to use the API\extension, but I don't feel like I owe anyone anything,

So it's up to users to decide if they want to give their whole watch history to an unknown third party? That's just sketchy.

other than being honest about what's done with their data and what's being stored and how their data is being used.

We can't know what you do with the data, because the backend is closed source.

I don't feel like I owe sharing the realtime DB or sharing backend code right now, if it causes me more problems than benefits.

So, you'll use the work of the archiveteam for free, but not share anything in return?

Just for this reason you owe sharing the DB.

I don't feel like I owe sharing the realtime DB or sharing backend code right now, if it causes me more problems than benefits.

How so? You have dozens of developers that will come and help you, I don't get what problem this will cause, unless you have to remove some sketchy stuff.

Anarios commented 2 years ago

Also, don't mind me asking but what is your reason for not wanting to open-source the backend?

  1. Simple laziness. It does include some additional work on my side to opensource it, don't forget - it was always in a half-proof-of-concept state
  2. Vote spoofing concerns. Users are not authorized (other than by a random-generated ID and a micro-proof-of-work task) - so it's not difficult to submit a bunch of votes and alter vote results, so I prefer to keep this part closed for now. Separating it into a module and sharing everything except this - could be possible as well, but it's half of code-base, someone would say that disclosing half of backend code is still not good enough.
  3. Maintaining an opensource project is lots of work.

Anarios commented 2 years ago

Also, 100$ to promote them: you got basically scammed at this price point.

It's a standard Patreon perk, anyone donating over 100$ is mentioned in sponsors list. (Unless I find anything sketchy about them)

Going FOSS is about sharing your work with the world, it's not about protecting your work: 99% of licenses protect it.

Would the people spamming the API with millions of requests right now respect the terms of a license? Will I have the time and ability to protect my rights and enforce the license terms?

So it's up to users to decide if they want to give their whole watch history to an unknown third party?

They see what's being associated with them - a random ID. If you think that a random ID paired with watch history is too valuable to be shared - then yes, you probably shouldn't use the extension. Or use it in incognito mode only, reinstalling it from time to time, and also use a VPN - there is no way around it.

We can't know what you do with the data, because the backend is closed source.

You wouldn't know what I do with data even if backend was open-sourced.

So, you'll use the work of the archiveteam for free, but not share anything in return?

I share the API with anyone for free. I also participated in crawl and advertised ArchiveTeam. I don't use any of their work yet, by the way. And I don't feel like there is any significant difference whether I use it or not - because I stored every video that is at least somewhat popular. I also provided archive team with a significant amount of video IDs for the crawl, so I'm not even sure who gains more here.

By the way - ArchiveTeam's server isn't opensource as well (the part that receives data from all crawlers) (At least to my knowledge, I might be mistaken)

Daasin commented 2 years ago

Agree that this integration would be nice for users. They seem to be open to working with Vanced, as mentioned in Linus' video (LTT), so I don't see why they couldn't work things out with NewPipe, especially considering that both projects appear to be under the GPL 3.0 license.

https://github.com/Anarios/return-youtube-dislike/issues/392

Anarios commented 2 years ago

I don't see why they couldn't work things out with NewPipe.

Well, NewPipe have privacy concerns which I can't resolve at the moment. Once the backend is open source - they might reconsider it.

Another possible solution for NewPipe (and anyone else having privacy concerns) - their own caching proxy serving multiple clients - that way no private data of users would be exposed to RYD servers.

(just contact me to raise request limit rates for proxy, as currently there is a 10 000 per day limit)
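
A minimal sketch of the caching-proxy idea, for illustration only. `fetch_upstream` is a hypothetical callable standing in for whatever request the proxy would make upstream, and the one-hour cache lifetime is an assumption; only the 10 000 requests per day figure comes from the comment above. The proxy answers repeat lookups from its own cache, so the upstream sees the proxy's requests and not individual clients.

```python
import time
from typing import Callable, Dict, Tuple

Votes = Dict[str, int]


class CachingVoteProxy:
    """Answers many clients from a shared cache so upstream only sees the proxy."""

    def __init__(
        self,
        fetch_upstream: Callable[[str], Votes],  # hypothetical upstream call
        ttl_seconds: float = 3600.0,             # assumed cache lifetime
        daily_limit: int = 10_000,               # limit quoted in the thread
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch_upstream
        self._ttl = ttl_seconds
        self._daily_limit = daily_limit
        self._clock = clock
        self._cache: Dict[str, Tuple[float, Votes]] = {}
        self._day = -1
        self._used_today = 0

    def get_votes(self, video_id: str) -> Votes:
        now = self._clock()
        cached = self._cache.get(video_id)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # fresh hit: no upstream request at all

        day = int(now // 86_400)
        if day != self._day:  # new day: reset the request budget
            self._day, self._used_today = day, 0

        if self._used_today >= self._daily_limit:
            if cached is not None:
                return cached[1]  # budget spent: fall back to stale data
            raise RuntimeError("daily upstream request budget exhausted")

        self._used_today += 1
        votes = self._fetch(video_id)
        self._cache[video_id] = (now, votes)
        return votes
```

With a design like this, popular videos cost one upstream request per cache lifetime no matter how many clients ask, which is also what keeps the proxy under the daily limit.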

rewrib commented 2 years ago

3. Maintaining an opensource project is lots of work.

Shouldn't it be easier to maintain? Because now you can delegate never-ending work like bug/exploit hunting + fixing towards anyone else. This project can always be improved and now it would continue to improve, even if you quit completely or your money runs out.

When it comes to centralization vs decentralization and privacy, LBRY might be a good example on how these things can be balanced.

wavetro commented 2 years ago

I know this is a shot in the dark but @Anarios have you reconsidered your stance on addressing the closed-source backend of RYD? The only Android YT client that implemented your database (Vanced) recently shut down and everyone seems to be flocking here to NewPipe

Atemu commented 2 years ago

An open-source back-end won't help much with anything. At the end of the day, we still have to trust that the operator really is using that code and there's no way to prove that.

The only thing we can expect them to implement is a privacy-preserving mechanism such as K-anonymity like Sponsorblock did.
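
As a rough illustration of what a K-anonymity lookup can look like, here is a hash-prefix sketch. The function names, the in-memory `database`, and the 4-character prefix length are all assumptions made for this example, not a description of any existing API. The client sends only a short prefix of the hashed video ID, the server returns every record sharing that prefix, and the client picks out its own video locally.

```python
import hashlib
from typing import Callable, Dict, Optional

Votes = Dict[str, int]


def video_hash(video_id: str) -> str:
    """Full SHA-256 hex digest of the video ID; never sent to the server."""
    return hashlib.sha256(video_id.encode("utf-8")).hexdigest()


def server_lookup(database: Dict[str, Votes], prefix: str) -> Dict[str, Votes]:
    """Server side: sees only the prefix and returns all matching records."""
    return {h: votes for h, votes in database.items() if h.startswith(prefix)}


def client_lookup(
    video_id: str,
    query_server: Callable[[str], Dict[str, Votes]],
    prefix_length: int = 4,  # assumed; shorter prefixes hide the video among more candidates
) -> Optional[Votes]:
    """Client side: send a short prefix, then filter the candidates locally."""
    full_hash = video_hash(video_id)
    candidates = query_server(full_hash[:prefix_length])
    return candidates.get(full_hash)
```

The server cannot tell which of the videos sharing a prefix was actually watched, at the cost of returning more data per request.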

FYI, https://github.com/polymorphicshade/NewPipe has RYD integrated.

opusforlife2 commented 2 years ago

Closing this for now. If the backend is ever open sourced, please open a new issue for this to be reconsidered.