StevenBlack / hosts

🔒 Consolidating and extending hosts files from several well-curated sources. Optionally pick extensions for porn, social media, and other categories.
MIT License

split.io is in hosts file #1345

Closed sudarpo closed 4 years ago

sudarpo commented 4 years ago

Hi there, May we know why split.io is in hosts file?
Aren't they kind of like a CI/CD provider?
https://www.split.io/product/faq/

acloud.guru uses split.io, so I was curious why I can't load the courses page at https://learn.acloud.guru/browse.
After I removed the hosts entries below, I could load the page. Thanks.

# [split.io]
0.0.0.0 events.split.io
0.0.0.0 sdk.split.io
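
For readers who want to do the same thing locally, here is a minimal sketch of dropping specific entries from a hosts file. It assumes the one-hostname-per-line `0.0.0.0 host` layout shown above; the function name `remove_allowed` and the `tracker.example` entry are made up for illustration and are not part of this repository's tooling.

```python
def remove_allowed(hosts_text: str, allowed: set[str]) -> str:
    """Return hosts_text without blocking entries whose hostname is in `allowed`."""
    kept = []
    for line in hosts_text.splitlines():
        # Drop any trailing comment, then split "0.0.0.0 host" into fields.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[1].lower() in allowed:
            continue  # skip the entry the user wants to reach
        kept.append(line)  # comments, blank lines and other entries stay as-is
    return "\n".join(kept)


# Hypothetical sample: the two split.io entries plus one made-up entry.
sample = """# [split.io]
0.0.0.0 events.split.io
0.0.0.0 sdk.split.io
0.0.0.0 tracker.example"""

print(remove_allowed(sample, {"events.split.io", "sdk.split.io"}))
```

Tools such as AdAway offer their own allow-list settings, which are usually a better choice than editing the generated file by hand.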

StevenBlack commented 4 years ago

Hi @sudarpo, the split.io domains come to us from AdAway.

@jawz101 can you please have a look, and advise? Thanks.

@sudarpo you should know, it doesn't matter what the service does. What matters is, does the service track or trade on its users? Is it a vehicle for adware and/or malware? If yes, then we block it.

Let's see what @jawz101 says.

jawz101 commented 4 years ago

I'm a bit reluctant to remove it. They have an Android SDK, VirusTotal scans show that Android apps directly communicate with those domains, their site lists integrations with dozens of advertisers who do segmentation and attribution...

I get the client integration point - testing feature rollouts and such, but I also see it as an end user who doesn't like to be experimented with, classified, or targeted with a discount because they fell into a certain bucket.

It's simply not something I'm interested in removing because I'm an end user and those who want it can whitelist it for themselves.

But, it's a public list and I'll do whatever most people think is right

StevenBlack commented 4 years ago

Good call, @jawz101 and, given your findings, I agree.

Thanks for raising this issue @sudarpo — we'll keep listing split.io.

Closing.

sudarpo commented 4 years ago

Thanks for the explanation guys. Appreciate the replies.

bengalih commented 4 years ago

I'm here because I just spent 2 hours trying to track down why the Sonic Mobile App kept crashing on my android phone when entering the place order dialog.

Turns out it was because split.io was blocked.

I use AdAway on my phone and even with logging DNS requests for some reason split.io never even showed up as a request in the DNS logs. I actually had to disable all monitoring (turn off hosts file) for the request to even show up, and even then I still needed to weed through dozens of entries.

I don't think it is right to expect users to spend so much time trying to determine how to get totally legitimate apps working. I don't know what Sonic is using split.io for...maybe they are collecting metrics, but without it the site fails to work. We need to stop blaming the tools in these cases.

Isn't there some way to identify sites like this and break them up into a secondary list for the truly paranoid who want to spend hours of their life tracing down queries just so they can order a milkshake for their kids?

@jawz101 please reconsider what this really means for most end users.

thanks

p.s. I'm also more than happy to work with someone from the AdAway team who can tell me why split.io wouldn't show up in "Log DNS Requests" unless ad-blocking was disabled.

StevenBlack commented 4 years ago

Ben @bengalih I hear you.

There is something you should fully understand. There is no such thing as "totally legitimate apps" here, based on what these apps nominally do.

The only relevant question is, does this thing track and/or trade its users, or is this thing a vector for adware or malware and other heinous things.

Some people tend to believe their hometown newspaper is "legit". The fact that it's a hometown newspaper is distinct from the fact that the people operating the newspaper, and their media partners, are being scumbags with their online practices.

So please understand, people use these blockers because they don't want to be traded or assaulted, under any circumstances. This isn't our problem; it's split.io's problem.

bengalih commented 4 years ago

@StevenBlack I hear you too. But is it really split.io's problem? They offer a tool. Should we not allow anyone to buy a knife even though it can be used to kill? Maybe...but I can also cut your heart out with a spoon. I know this is a deep philosophical argument and I am unlikely to change your mind...and maybe I don't even want to (i.e. perhaps I agree). However, look at the problem I had:

1) Sonic mobile app (i.e. this is like me telling you McDonald's mobile app depending on the region you live...we probably have more Sonics here than McDs) operates but crashes when trying to place order.
2) Reason is because split.io is blocked by AdAway list
3) AdAway is application on phone which is actually doing the blocking based on their list, but still fails to log split.io as a DNS request.
4) AdAway only shows split.io request if blocking is turned off entirely (makes no sense and took 1.5 hours to even discover that).
5) Even after sorting through above, still takes 30 mins of trial and error to track down offending site.

I'm one of those people who don't want to be traded/assaulted, but there is no such thing as "any circumstances" unless you basically live off the grid. I highly doubt the majority of users of these lists believe they are using electronic devices and not compromising at some point. I stand by my suggestion that there should be some way to differentiate these things into additional levels. This makes choosing your level of protection and also troubleshooting these issues better. I'm sure there are dozens if not hundreds of sites like split.io which are neutral tools which can be utilized in nefarious ways. Why can't we get this to another list which we can easily toggle and then if we have success it is easy to narrow down the culprit.

Thanks for listening.

(Honestly, I think much of this is AdAway's problem, I know you are just maintaining a list and even with split.io remaining I can think of features that would have turned 120 minutes into probably 5 minutes and I will be happy to discuss these with @jawz101 if he would like to.)

jawz101 commented 4 years ago

I maintain the AdAway hosts file but the app is maintained by @PerfectSlayer

You may want to open an issue on the AdAway app repo to see why a domain isn't logged when traffic from a particular app does go through it.

Since you did comment I'm re-investigating the domain. If it feels more like general telemetry I'll take it out. I'm not a fan of blocking things such as Firefox's telemetry, for comparison, because it's clearly used to measure technical bottlenecks and feature use and the effort goes directly back into the product.

It's just one of those gotchas. When most apps function without breakage, it's difficult to find one app that does break when they must use this sdk in a part of their app's code to measure something that, in the act of observation, actually interferes with the functionality of that thing. That is one thing that is odd from my perspective.

The other thing that bugs me is they advertise integrations with other products that we also block. https://www.split.io/product/integrations/

Google Analytics, mParticle, AppDynamics, New Relic, Sumo Logic, and Segment are names I am familiar with; I'm sure I've added tracker definitions to the Exodus Privacy Project for them.

Jira and Slack are familiar but they're in a legitimate bucket. It's just weeding things out properly. I consider these convos regardless because your concerns are valid and I wouldn't have seen the whole picture without the feedback.

After looking at the site for the past 10 minutes you have me convinced. And then I look at the customer testimonials https://www.split.io/customers/

These are stories straight from their customers. "I upload a bunch of contact lists, give each of them a score on how much money we can get out of them, and then give out coupons to those who have a lot of cash to give us. They go through less clicks too." That's the kinda stuff I would opt out of.

Twilio

Understanding the impact across segments – Gain a deeper understanding of user behavior, by testing a feature across multiple users or customer segments. This can be done per customer account, or per individual user. The segments are typically different use cases including different geographies, different code language, and level of interest in the product.

Thred UP

You can pass attributes into Split for targeting, so we passed in some of our machine-learning scores, and then that would dictate what promotion a customer would see.

Vida

Vida Health is a digital health company offering a full suite of personalized health programs and one-on-one coaching and therapy.

Vida’s horizontal platform integrates leading behavioral science and data science to drive health outcomes for members, including those with chronic physical and behavioral health conditions. Vida leverages Split’s feature flagging and deep experimentation capabilities to build and deliver innovation faster with less risk....Vida plans to pull Split impression data into Google BigQuery where their data scientists conduct broader analysis for building new customer features and engineers train machine learning models.

GoDaddy

The team is exploring means to understand cart abandonment better and weave in experiments that can lead to a reduction in cart abandonment – a project that would have a transformational impact on the company’s top line growth....jörn envisions a future where 100s of experiments are run on the Split platform every week, helping GoDaddy make smarter product decisions every time.

GoodRx

The GoodRx team wanted to implement more sophisticated product experiments, and they liked Split’s advanced segmentation capabilities such as assigning custom attributes for more granular user targeting. For example, GoodRx can segment users based on dimensions such as the user’s city and state so they can serve up feature variants of an experiment based on user location or IP address. GoodRx also takes advantage of Split’s whitelist functionality to make internal testing of backend systems much more manageable.

Envoy

Some Envoy customers have very large contact lists that need to be uploaded. Loading these large files can skew performance results during the initial evaluation. With Split, Envoy product managers are able to easily define the target segments, giving them better control when evaluating the impact of the change on performance and infrastructure. By leveraging segments, they are able to avoid customers that fall into outlier categories during the early rollout phases.

Product managers at Envoy are laser-focused on achieving aggressive revenue growth goals. Split is helping Envoy achieve business objectives faster than ever before as product managers now launch experiments every two weeks. They like how easy it is to ‘bucket’ customers and track what treatments have been exposed to each customer so they can quickly analyze results. Many of Envoy’s experiments focus on increasing user activations since these drive revenue flow.

Product managers use their splits in creative ways to increase revenue. For example, they are able to split based on license entitlements, making it easy to test new features on a subset of customers in each of their pricing tiers. They run promotions that temporarily turn on a top-tier feature for a customer who is on a lower tiered plan to encourage them to upgrade. This process also helps them build a close relationship with their customer base as they track how new features change behavior, giving them confidence that scaling to 100% will be successful.

bengalih commented 4 years ago

> After looking at the site for the past 10 minutes you have me convinced. And then I look at the customer testimonials https://www.split.io/customers/

Thanks for the feedback. In the end though I'm not sure what you are opting for: keeping it, or leaving it.

While I understand your analysis, I guess my outlook on this is a bit different. If a company is using these neutral tools to do something you would consider invasive to their customer base, then why would you continue to be a customer if you didn't agree to this? Personally, I don't think anyone really believes that they can be a customer and not have the business track their metrics. By the very fact that I am allowing their app on my phone, that I give it access to my GPS so that I can order food, that I pay with a credit card, etc, they are taking more valuable data from me than I can believe they are getting with split.io and similar things. I can certainly understand people not liking this, but then they shouldn't install the apps, or be customers of these companies in the first place. IF however you do opt to be a customer and use their apps, this is preconditioned on wanting the apps to function, which they won't if sites like this are blocked.

I'm sure we could debate this ad infinitum, and what's worse is we would most likely both be swayed back and forth constantly with each other's arguments for each individual site and the time investment would be enormous. So, I'll cut back to my suggestion from before. Instead of spending 10, 15, 30+ minutes investigating sites like this and laboring over the decision to allow or not, isn't it more efficient to break these out into separate lists? I mean even you are on the borderline on this one and personally, if it's just a telemetry site many people may not be as concerned. I think there is a very very large segment of your user base that wants to block harmful sites, but wouldn't consider sites like this necessarily bad.

Obviously you guys do the lion's share of the work, but it seems a bit authoritarian. As an experienced IT professional, I know I have thrown out solutions that were too strict because there were just too many issues getting them to work. I can't tell you how many people I know (also tech pros) who got rid of similar blocking apps, pi-holes, etc, because it became too much of an effort to track down white-list entries. I think the perfect is the enemy of the good in situations like this.

Anyway, that's just my 2 cents on the issue, again thanks for listening. Turns out I was not the only one having this exact same issue, and luckily I was able to provide the solution over at:

https://github.com/AdAway/AdAway/issues/2103

PerfectSlayer commented 4 years ago

Hi! AdAway developer here =]

@bengalih There is a reason why the app can't log blocked domains when ad blocking is enabled: no request is sent (as it is blocked), so it will not appear in the tcpdump log (the underlying tool the app uses).
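
To illustrate the point, here is a toy model of a resolver that checks a hosts mapping before going to the network. It is only a sketch under stated assumptions: `resolve`, `wire_log`, and the placeholder address are invented for this example and do not reflect AdAway's code or Android's actual resolver.

```python
def resolve(name, hosts, wire_log):
    """Toy resolver: consult the hosts mapping first, go to the network only on a miss."""
    if name in hosts:
        return hosts[name]      # answered locally; nothing leaves the device
    wire_log.append(name)       # stands in for a DNS packet a capture tool could see
    return "203.0.113.10"       # placeholder answer from a documentation address range


hosts = {"sdk.split.io": "0.0.0.0", "events.split.io": "0.0.0.0"}
wire_log = []

resolve("sdk.split.io", hosts, wire_log)       # blocked: resolved from the hosts mapping
resolve("learn.acloud.guru", hosts, wire_log)  # not listed: would go out over the wire

print(wire_log)  # ['learn.acloud.guru']
```

In this model the blocked name never reaches `wire_log`, which mirrors why a packet capture has nothing to show for it while blocking is active.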

I don't automatically disable ad blocking when users start recording a log because some of them complained that ad blocking would no longer work... (Nobody reads the documentation =/)

Another point is that the Sonic app should not expect split.io to be always accessible. It should fail gracefully and support an alternative or default behavior. You'd be better off filing a report on the Sonic Play Store page rather than finding the right host to please the app...
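
A minimal sketch of what failing gracefully could look like on the app side. Everything here is hypothetical: `get_treatment`, `blocked_fetch`, and the `new_checkout` flag are illustrative names, not Split's SDK or the Sonic app's code.

```python
def get_treatment(fetch_flag, flag_name, default="off"):
    """Return the remote flag value, or `default` if the flag service is unreachable."""
    try:
        return fetch_flag(flag_name)
    except OSError:
        # ConnectionError and socket timeouts are OSError subclasses, so a
        # blocked or offline flag service falls through to the default.
        return default


def blocked_fetch(flag_name):
    # Simulates a host that resolves to 0.0.0.0 and refuses the connection.
    raise ConnectionRefusedError(f"cannot reach flag service for {flag_name!r}")


print(get_treatment(blocked_fetch, "new_checkout"))  # off
```

With a fallback like this, the order dialog would open with default behavior instead of crashing when the flag host is blocked.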

About maintaining a second list, I wouldn't recommend that. @jawz101 is already doing amazing work and maintaining a second list requires more time. And why stop at only 2 lists? There will be requests by country, religion, drugs, etc... I trust him to choose what's right to be on the list or not, and one list is enough. Keep in mind our list is optional in the AdAway app.

I'm sorry you lost 2h but it's always worth it when you deal with privacy. Moreover, you might have learned some stuff in the process ;) Otherwise, developing AdAway is losing 3y of free time to turn them into a bunch of complaints and GitHub issues ^^"

As a final word, you might be aware that I started a new version. It will allow you to whitelist apps from ad blocking, so you can quickly fix a poorly coded app. The log feature will also be improved. The non-root mode already logs all requests without disabling ad blocking, for example.

Big thanks to @StevenBlack also while I'm here ^^"

bengalih commented 4 years ago

Thanks @PerfectSlayer for your reply.

I guess what you say makes sense: since the hosts file is queried, there is never any reason for the query to go out over the wire, and I suppose there is really no way to tap into the system's call through the AdAway app. I admit I did not RTFM on this, but when you say this it makes perfect sense.

I assume your new app design is like the non-root in that it uses that android "local" VPN connection to filter the requests. I prefer to stay away from that methodology, but may reconsider. Are you still reconsidering enhancements to the traditional app? If so, I can open a new ticket over there to discuss some things with you...

I also agree that the Sonic app is misbehaving simply crashing instead of stating "unable to reach split.io" or something like that. I can certainly raise the issue with them, but it is highly unlikely they will do anything about it except say "yeah, you need to allow that site" and at this point (or when a similar case comes up with another app), I've already done the hard part of figuring out which site it is.

I of course recognize all the work you, @jawz101 and @StevenBlack do for this and we are all very grateful, but we of course wouldn't have these GitHub communities if you were not interested in soliciting some feedback and I'm sure (based on projects I've worked on in the past) you all take pride in a product that the user community puts faith in. So my comments are not meant to be ungrateful, but rather to show an opinion of the user base that you might not see as developers. I'm sure for every post like mine there are hundreds if not thousands of other people who might have similar issues, but would never bother to post. I'm here to discuss this in a measured and respectful way, explain the issues, and just make you guys think about a direction you could take to make things easier for those in the user base who have similar case scenarios.

As such, I believe when it comes to maintaining multiple lists this is not a horrible option. I don't believe asking to split some "iffy" sites into a separate file goes any further than having separate list for gambling/porn/piracy/spyware, etc. I think a "telemetry" list is not so far removed as it would be to create things based on region or religion, etc.

Perhaps, if maintaining a separate list is not an option, there could be some standard for annotating current lists to assign categories for each group of entries, or perhaps a confidence score that would allow an app to more easily parse out which entries could automatically be moved to a whitelist (or applied at all). This might seem a large investment to make in terms of overall structure, but it is something that could be moved to over time to make the whole infrastructure of host-based blocking more flexible.
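
As a rough sketch of the annotation idea, assuming a purely hypothetical trailing-comment format such as `# category=telemetry confidence=0.6`. No such annotations exist in the current lists; the tag names, the scores, and the `ads.example` entry are invented for illustration.

```python
def parse_entry(line):
    """Parse '0.0.0.0 host # category=... confidence=...' into a dict, or None."""
    body, _, comment = line.partition("#")
    fields = body.split()
    if len(fields) < 2:
        return None  # comment-only or blank line
    # Collect key=value tags from the trailing comment (hypothetical format).
    tags = dict(item.split("=", 1) for item in comment.split() if "=" in item)
    return {
        "host": fields[1],
        "category": tags.get("category", "uncategorized"),
        "confidence": float(tags.get("confidence", "1.0")),
    }


def blocked_hosts(lines, skip_categories=(), min_confidence=0.0):
    """Return hosts to block, leaving out skipped categories and low-confidence entries."""
    hosts = []
    for line in lines:
        entry = parse_entry(line)
        if entry is None:
            continue
        if entry["category"] in skip_categories:
            continue
        if entry["confidence"] < min_confidence:
            continue
        hosts.append(entry["host"])
    return hosts


annotated = [
    "# [split.io]",
    "0.0.0.0 sdk.split.io # category=telemetry confidence=0.6",
    "0.0.0.0 events.split.io # category=telemetry confidence=0.6",
    "0.0.0.0 ads.example # category=ads confidence=0.9",
]

print(blocked_hosts(annotated, skip_categories={"telemetry"}))  # ['ads.example']
```

Because the annotations live in comments, a file in this format would still work as an ordinary hosts file for tools that ignore them.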

I think you should all consider it a compliment, that when I have app crashes like this I rarely even think of AdAway and @StevenBlack 's lists as the culprit because most of the time it works so perfectly that I don't even consider it...I usually think it has something to do with root in general or simply an OS problem. And of course that you guys are so accessible and willing to listen to community input is something we would never get from the app developers from the Play Store (at least not those from a large corporation).

Again, thanks for listening and considering the options, even if you chose to disregard them in the end.

StevenBlack commented 4 years ago

@jawz101 we block "general telemetry" here.

Look, the Firefox people get their avalanche of telemetry whether our hosts block, or not.

The people we serve don't want to be tracked in any way.

My view: an app that requires telemetry to work is an app whose developers, and their customers, can live with that decision.

jawz101 commented 4 years ago

I don't know what to do. I feel like I've spent more time researching than anyone, given salient reasons it is there- basically ignored- and been told I may have an ego about my stance. @bengalih I respect your input.

I've listed several companies who use this tool and how they use it. For the most part, it's doing one thing to one group of users and seeing if it affects their use of a service. As for customer consent to behavioral influenced purchases and other marketing strategies- well, the whole list is based on us trying to block this stuff. I mean, you consent to visit websites. Why do you block the ads they display to generate revenue?

I look at Split.io as putting lipstick on a pig. Or a wolf in sheep's clothing. It's a tool to do one thing that sounds rational... but all of the actual use cases are for user targeting.

I think we are all the maintainers. It's also a safe bet that everyone in this convo works as an IT professional. As for speaking on behalf of the thousands of people who use this, I'm trying to do that too. As I see it, it's two people (you and the OP) against everyone else who uses the list and has not raised an issue with these 2 entries.

Should I remove them? It impacts the thousands (maybe millions) of other users who opt to block them? I fully support your conclusion.

bengalih commented 4 years ago

> Should I remove them? It impacts the thousands (maybe millions) of other users who opt to block them? I fully support your conclusion.

You're the expert here. You've put the time and effort in. I'm not about to tell you to remove it if you truly believe keeping it is the right call. You're here willing to listen, and I respect that, and therefore respect your ultimate decision. That being said, we could debate endlessly the philosophical and practical applications and needs of the block lists and applications, and I'm more than happy to continue having debate if you find it helpful to discuss other viewpoints. I'm also happy to just let it go since I've said my piece and I feel that you have actually listened and considered.

I'll just close here with a couple of views/retorts to your last statements:

I mean Steven's lists here are a perfect example in how you can opt into which ones you want and which ones you don't...it provides a granularity. I always think that granular/atomic structures are best for development, maintenance, and use, which is why I recommended to consider segregating the lists somehow (be that actual separate lists, or somehow annotating the main list to allow atomic parsing). Is it more work? Maybe yes, maybe no. You certainly wouldn't need to spend so much time investigating a site like split.io, since you can just drop it in a telemetry list (or 2nd tier threat list), and move on. So, as I've said ..just consider those options as you continue on and if a time comes when you need to rework things, maybe this will provide a possible path you can work down.

Again, I don't want to keep beating a dead horse here, and I do feel you've listened to my points and considered them so I don't feel the need to continue reiterating them beyond this.

Thanks again for all the work you guys put into this, I hate living without them on my phone (although I don't use these lists on my home network due to the endless maintenance required for providing access to things family members need access to :) ).

PerfectSlayer commented 4 years ago

> I have a feeling that most people who provide inputs here are heavy users and probably have a tech background.

Yes, and they are mainly people complaining about the current state of things. I can't remember seeing happy people open an issue just to tell us so! (Even the term "issue" makes the orientation of the feedback we receive clear.)

> I mean Steven's lists here are a perfect example in how you can opt into which ones you want and which ones you don't...it provides a granularity. I always think that granular/atomic structures are best for development, maintenance, and use,

Only best for users... What development are you talking about in this case? The maintenance script is simpler with only one source, for example. And it will require less maintenance time.

> I mean Steven's lists here are a perfect example in how you can opt into which ones you want and which ones you don't...it provides a granularity.

So stick with Steven's list 😄 Everything has its purpose here. AdAway is an ad blocker Android app first. It offers a general purpose block list for its users. If you have a more specific need, you are free to tune it the way you like.

It's nice that Steven's list offers this kind of feature, but you can't expect two guys to maintain professional grade ad blocking in their free time.

Is modularity better? Certainly. Could you achieve it with 1h / week and maintain it for some years? Not sure.