Open abh opened 7 years ago
OK .. I wasn't aware of that policy. I'll change as you suggest.
Changes made .. all references to hard coded NTP Pool hostnames have been removed.
Thanks for prompt response @abh @gavineadie -- sorry that none of us saw this problem coming here. We probably could've. May be worth a warning to users in the readme to think through the repercussions of issuing (k req/s * d devices)
Hopefully devs at snapchat update/fix asap.
@abh is there anything else problematic in this lib? May be a good time to review/audit it. If people are in contact with snapchat devs and the push other fixes-- important to get those back here upstream to ensure the next mobile/iot mega app doesn't cause other problems for ntp.
Yes .. Let me be frank since not all the history is known. I wrote this library a long time ago for the iPhone 3GS and earlier. At that time, as best I could tell, the iPhone obtained its time from the phone company and, for me in South-East Michigan, it was not unusual for the iPhone time to be at up to two minutes variance from true time. Since my app was predicting the location of the International Space Station and a two minute error represented about 600 miles, I wrote this library so people wouldn't be staring at the wrong side of the sky!
Since then, the world has changed. First, iOS devices use the NTP protocol as a time source -- I stopped using my own library years ago because iPhones were delivering sub-second accurate times natively. Second, my first GitHub contribution was this library and I now consider that a mistake -- it was a fairly immature piece of work (@jbenet improved things significantly, much appreciated). Third, it became clear that this library was mostly being used in a 'one-shot' manner to check the iOS time hadn't been pushed back to defeat software registrations, and there are better ways to do that (in fact, this library had a one-shot capability added). Fourth, I never dreamed an "iOS mega app" would use it -- I would have discouraged that earnestly.
I've thought for a while that this library had passed its "sell-by" date and was of minimal value; this Snapchat incident (about which I've heard only a little, and that obliquely), makes me want to remove it. I've been involved in network software, on and off, for about forty years and regard myself as a good net citizen. I would very much appreciate hearing stakeholders' thoughts on the best (or just good) next steps.
Thanks @gavineadie and @jbenet!
@gavineadie I agree with much of what you wrote, but a silver lining is that the library being open source did give the community a way to find the code and help get it and documentation for future users corrected. :-)
True, but I still feel like an idiot !
I should also say the library will query NO servers in its new default state .. now a ntp.hosts
file MUST be provided.
@gavineadie ❤️
i think we could've done more (we can always do more). We could've known about that recommendation and fixed our code, we could've documented this better, and so on. But i think the main issue in this incident is deployers not understanding what they're deploying. This should've been caught by snapchat's development & deployment team way before making it into production in millions of devices (if it hit that, i still have little knowledge on the numbers).
I personally think that this library being here helps everyone. Open source is tricky, but we are all better by this library being here, than it not being here at all. It allows better community review and fixes, and i count our involvement here (however small) as a net win (@abh and the NTP folks were able to track the problem down and point to the code, @gavineadie was able to fix it quickly, and snapchat was able to pull latest (or apply their own patch), thanks to the code being Open Source and readily available. I wouldn't want to see it go.
(sorry, had to step out for a bit -- i wanted to add the following)
Second, my first GitHub contribution was this library and I now consider that a mistake -- it was a fairly immature piece of work
Hmm-- @gavineadie i am personally very thankful that you made this implementation in the first place, and that it worked well; it solved my problem and allowed me to make a game that brought joy to 100,000s of people. (where they couldn't cheat by changing their phone's clock!) And i'm sure so many other users along the years have also been very thankful for it-- it fit many of our needs. Your work is much appreciated, so please feel very welcome to continue your contributions, for the commons can only get better when we work openly. ❤️
Fair enough, all good points ..
I'll note, for what it's worth, that NTPv4 RFC makes no mention of not using the "pool" hosts, so due diligence was done there. Only today did I become aware of the draft "best practices" document (now included in the repo).
I'm reminded that years ago I read the SMTP standard(s) carefully when doing a mainframe implementation; I even used the BNF grammar in the document appendix to parse the protocol. When I got it working, I found the distressing extent of the violations of the protocol in the wild .. my first experience of "be generous in what you receive, and rigorous in what you transmit" !!
I could add too, sometimes it's very important to make sure all devices in your ecosystem as synced to a singular time source! Across hardware manufacturers... operating systems... different leap second handling implementations. Now I can be certain that every user of app whether iOS, Android (on a whole host of different hardware), plus my RedHat servers are all synced to the same time source.
Please don't hard code NTP Pool (pool.ntp.org) hostnames. It appears one of your users had a bad implementation and caused considerable havoc on the system. Leave the list empty and reference http://www.pool.ntp.org/vendors.html
https://twitter.com/paulgear1/status/810967233375531008