phusion / passenger

A fast and robust web server and application server for Ruby, Python and Node.js
https://www.phusionpassenger.com/
MIT License
4.99k stars 548 forks

Migrating Passenger from C++ to Go #2118

Open FloorD opened 5 years ago

FloorD commented 5 years ago

Today Passenger is mostly written in C++. Back when Passenger was created, C and C++ were the only viable options to offer Apache and Nginx integration, ease of installation (users commonly have a C/C++ compiler installed), and performance.

But the programming language ecosystem has evolved since then. Compared to alternatives, C++ is limiting our development velocity. We've been considering migrating Passenger to Golang. @FooBarWidget elaborated on this in a post on the Phusion blog: https://blog.phusion.nl/2018/09/18/migrating-passenger-from-cxx-to-go/

... and we need to hear from you if you compile Passenger from source. (If you use our binaries, there will be no change in the way Passenger is installed or used.) If we adopt Go, then you'll need to have a sufficiently recent Go compiler installed (we're targeting version 1.11). Would that be at all acceptable to you?

Please let us know in the comments 💬 to this issue! 🙏

schneems commented 5 years ago

Will gems ship with pre-compiled extensions, or will they need to install from scratch? I.e., when deploying on Heroku, will we need to make a Go runtime available, or will it work on current infrastructure?

FooBarWidget commented 5 years ago

@schneems Short answer: no Go runtime needed on Heroku.

Long answer: Passenger consists roughly of 3 components:

  1. A web server module.
  2. An agent process.
  3. A Ruby native extension, to be loaded inside Ruby.

Components 1 and 3 are not affected by the Go migration; they will remain as they are. Only component 2 is, but we will supply precompiled Heroku-compatible binaries for that component.

ayang64 commented 5 years ago

My only suggestion is to avoid fasthttp at all costs. Justify using something other than net/http with benchmarks and when you do, ensure that you've optimized everything else.

fasthttp isn't well maintained, had long-standing bugs until recently, and may unnecessarily complicate your code.

hone commented 5 years ago

@FooBarWidget that blog post is pretty fantastic.

As an end user I ultimately don't care much what you use under the hood as long as I don't personally need to do extra work. How many of your users compile from source?

It sounds like for technical reasons you're pretty set on Go. One caveat to be aware of (which may or may not affect you) is that one of the biggest reasons we dropped Go from the Heroku CLI was needing to recompile binaries when new OS X versions dropped. I can put you in touch with people if you want to dig into it more.

Though Go is not my favorite language by any means, the blog post outline makes a bunch of sense. IMO Go's biggest strength is onboarding/learning. Just doing the Tour of Go is usually enough to be a little dangerous. I believe this is one of the drivers behind its adoption, along with performance and other things.

Re: Rust, I'd agree the ecosystem is not quite there yet. Tokio still seems to be in pretty active development, but stuff is starting to stabilize? It works on stable rust at least. Might be worth reaching out to @wycats on the topic.

Re: Java, have you looked at all at the GraalVM? I believe they've posted some impressive numbers on boot time with it.

Best of luck!

FooBarWidget commented 5 years ago

@hone I would love to get in touch with those people regarding the need to recompile binaries for newer OS X versions. This is the first time I've heard of this.

hone commented 5 years ago

@FooBarWidget @jdxcode is our lead engineer on the CLI who had to deal with the Go/OS X issues.

jdx commented 5 years ago

to be clear, it wasn't that we needed to recompile exactly but that we encountered serious new bugs on 3 different macos releases with existing binaries. Recompiling wasn't sufficient to fix them though, there were actual bug fixes that had to land. We dropped go a couple of years ago though and I can't speak to it since then, but it was an awful experience maintaining a CLI written in Go.

Cross compiling also bit us a few times where we were linking to things on our macos box that would then break on linux (like getting the home directory using standard go core).
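A minimal sketch of one way to sidestep this class of problem, assuming the failure was a cgo-backed user lookup that is unavailable in a cross-compiled binary (the comment above does not confirm the exact cause). The helper `homeDir` is hypothetical: it prefers environment variables and only falls back to `os/user`.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/user"
	"runtime"
)

// homeDir returns the current user's home directory. It checks the
// conventional environment variable first, which works in binaries built
// without cgo, and falls back to os/user only if the variable is unset.
// getenv is injected so the lookup can be exercised without touching the
// real environment.
func homeDir(getenv func(string) string) (string, error) {
	key := "HOME"
	if runtime.GOOS == "windows" {
		key = "USERPROFILE"
	}
	if dir := getenv(key); dir != "" {
		return dir, nil
	}
	u, err := user.Current()
	if err != nil {
		return "", fmt.Errorf("home directory lookup failed: %w", err)
	}
	if u.HomeDir == "" {
		return "", errors.New("home directory lookup returned an empty path")
	}
	return u.HomeDir, nil
}

func main() {
	dir, err := homeDir(os.Getenv)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("home directory:", dir)
}
```

Testing the cross-compiled binary on the target OS, not only on the build machine, is what would actually catch this kind of breakage before release.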

I've found packaging Go to be much more difficult than in other languages as well. You pretty much need to build tarballs by hand instead of putting assets in a gem or npm package through standard conventions. You end up with a ton of boilerplate scripts that are hard to maintain and hard for others to grok.

Note that I use Go currently for a couple of services with good success, this isn't a complaint about the language, it's more about distribution/release problems we ran into being a tool run client-side.

YMMV of course, but this was my experience.

CamJN commented 5 years ago

Go's biggest problem on macOS is here: https://github.com/golang/go/issues/17490. Apple will and does break things for go all the time, and binaries are not 100% portable.

thoughtafter commented 5 years ago

I am also writing a project in C/C++ and have had similar concerns. I would caution against moving too fast; a rewrite can be very time intensive and you can experience unexpected problems. I'm not sold on Go as a replacement for a C/C++ code base, but ultimately you have to choose what you think is best for future development and maintenance.

Specifically related to the question of having a Go compiler, I think more details would be useful. I have Passenger deployed mostly on Ubuntu, and mostly 16.04 and 18.04. However, when deploying Passenger on 14.04, the easiest Go version to install is 1.2.1. Are you suggesting more recent than that? What is the minimum version you would like to support? I would then use that to canvass various distributions to see which will require effort beyond a default package installation.

Take your time.

jdx commented 5 years ago

I realize this may not matter much for this project, but we've been shipping node binaries in our CLI for 4 years now and have not had a single issue with the binaries after macOS releases. I'm not sure how exactly v8/node system calls work differently than in Go (I suppose that's the thing, I don't have to), but I wouldn't be surprised if it's what @CamJN mentioned.

FooBarWidget commented 5 years ago

@thoughtafter We would like to use Go 1.11 (the latest version right now), but preferably a future version for generics support. This would make installation for existing distributions a bit troublesome, but we do supply binary packages for Ubuntu.

CamJN commented 5 years ago

@jdxcode Node is written in C++, so it just links to Apple's provided system call library, which provides a stable interface. Go implements its own library, which calls directly into the kernel. That API is unstable.

defunkydrummer commented 5 years ago

I'm amazed that your hand-picked alternatives are only two: Rust vs Go. First, for a high-performance language for writing a server there are many mature alternatives. Not only that: I wouldn't consider Rust or Go mature, both of them having only one (1) major compiler and no standardization. You mention concern over the lack of C++ developers, but for both Go and Rust the developer pool is even smaller than C++.

If it's a C++ codebase, #1 alternative ought to be Pascal (modern Free Object Pascal), which has none of the problems mentioned for C++, and better performance than Go, not to mention a drastically lower memory usage than Go, Java or any of the GC languages. Not to mention that it's far easier to learn. The ecosystem is fully mature, the tools are out there, the FPC compiler works perfectly, and it's a very small download.

If i wanted "high performance" understood not only as fast execution but also low memory usage, i would stay clear of the GC languages. And, mind-you, my fave language is a garbage collected language!

mfcastellani commented 5 years ago

You really should take a deep look at Rust.

mlh758 commented 5 years ago

@defunkydrummer I thought Go had the standard compiler and a plugin for gcc? Although looking at my go environment it seems like the current compiler is the gcc compiler so I might be misunderstanding. Also, how broad is Pascal use? This isn't a dig, I just haven't heard much about the language since school. It's still pretty high on the TIOBE index though so it may just not be in use in my area.

More to the topic of the issue, at work my team uses Passenger for several applications and you provide binaries for the Linux distributions we deploy to so it shouldn't impact us. We are also moving towards more containerized deployments and building a binary from an intermediate container isn't too bad of an extra step so it still wouldn't hurt too bad if we had to build from source since there are build images available that provide the go runtime.

I've had good luck with the couple of Go applications I've created for work, but my frame of reference on performance is Ruby so take that with a grain of salt. I will say I've had a very easy time teaching team members how to write Go code. New people have an easy time picking it up and becoming productive and the small language usually means you can look at someone else's code and generally figure out what is going on.

My experience with Rust is much more limited - I'm new to the language and learning it as a curiosity. My experience so far though is that it is hard for the right reasons. The language is pretty easy to grok, it's just hard to get it to compile and you have to think more about what you're writing than is typical. The trade off is that once it does compile you can be more confident in what you wrote. It also supports generics so you wouldn't have to worry about migrating to a newer version of the language once that feature became available if it is something you need (interfaces suffice comfortably enough in my projects). It's also not garbage collected which may be a win for your needs and the intent of the language seems to be targeted more to your use case.

These are both new languages though, I'd be a little worried about them falling back out of fashion and not getting the developer base you are expecting. There are large projects consuming both, so that risk is reduced, but it's hard to compete with the decades of use C++ has built up over the years. It's still taught in a lot of colleges so many people are at least passingly familiar with it.

PikachuEXE commented 5 years ago

For me, I'd rather learn (relearn?) C++ / C, which can be used for Nginx and other libraries. The thing stopping me from understanding the code is documentation (although lack of C++ knowledge might prevent me from reading the docs?).

theckman commented 5 years ago

I ended up here from the blog post, and thought of a concern of switching to Go. Full disclosure Go is my "weapon of choice", so I have my biases towards using it. However, Passenger isn't in the tech stack I directly work in so this choice doesn't directly impact me.

That said, when making the move we should consider whether everyone (or enough of the people) run on an OS well-supported by the Go compiler. A few years back in the Chef community there were some discussions about moving to Go for some things, and one of the major sticking points was the lack of OS/architecture coverage compared to Ruby. Does the same concern exist with Passenger?

theckman commented 5 years ago

@thoughtafter You don't need Go installed on a server to run a Go binary. For Linux they are standard ELF binaries that can be executed on their own. So as long as you use one of the pre-built binaries, you don't need to install Go. Also don't install Go through the package manager. 😛

marius commented 5 years ago

It may be a bit late for your evaluation of Go vs Rust but this is hot off the press and lists a few valid points: http://dtrace.org/blogs/bmc/2018/09/18/falling-in-love-with-rust/ Should you get bitten by Go's M:N threading model don't say you haven't been warned. :-)

tinco commented 5 years ago

Any language that does not have an event loop built in will have the problem of multiple alternatives being available. Rust is moving towards a standardized interface for futures (Rust nomenclature for Promises), so once that hits at least your code can be decoupled from a specific event loop. That said it definitely is one aspect of Rust that is still in development, so if stability really is the main concern then I agree Go looks more attractive.

Why wait for generics in Go? If (when?) that gets in, it'll also be new and experimental, so that defeats your stability argument. There will be zero stable libraries in the Go ecosystem that support generics. Also, I feel that if a project requires generics, then it probably isn't the best fit for Go. If you want generics, why not go for AOT C#? That has a stable specification, a huge active community and generics. (I'm not seriously recommending C# btw, just use Go...)

philayres commented 5 years ago

I'll be honest and say that I know nothing about Go or Rust. I have been using Passenger for Ruby on Rails apps for a long time; many different apps on many different platforms. I've seen how memory hungry JVM services can be and it makes me worry.

A big reason I have stayed with Passenger has been its availability on AWS Elastic Beanstalk. This has allowed me a consistent package to work with across several platforms (from Centos 6 - ugh, Ubuntu and AWS). So, the reason I'm writing this comment...

The memory footprint of Passenger after migration will need to fit into the minuscule memory of an AWS EC2 t2.micro for a reasonably sized app. If I suddenly need a couple of gigs of memory to run the apps, it won't work for me, nor possibly for a lot of die-hard Passenger users who like using free-tier Amazon instances to get started with projects, or to scale loads of instances horizontally.

Good luck with your decision.

dansouza commented 5 years ago

Regarding the blogpost at https://blog.phusion.nl/2018/09/18/migrating-passenger-from-cxx-to-go/

> Go's I/O model fundamentally requires more memory for idle connections than our current evented C++ server. This is because the evented C++ server does not allocate buffers until there is data to be read over the socket. If we have many idle HTTP connections (e.g. waiting for keep-alive, or waiting for new WebSocket data) then Go will use a lot more memory than our C++ server.
>
> This problem cannot be solved without intervention from the Go authors. But whether this is a real problem in practice for Passenger users remains to be seen. We have not heard of anybody using Go in production to complain about this issue. If you have experience on this subject, please share your thoughts with us.

I remember seeing a trick to deal with this, but I'm fuzzy on the details: basically you try to read just a single byte, then once that read returns, you then allocate your full 4K buffer, put the byte you read on it, then read the rest of the incoming stream from the socket into the rest of the buffer. This way every idle connection only uses 1 byte. A bit more work and two syscalls instead of one, but hey!

patleb commented 5 years ago

I dug a little into the source code to understand where the output from passenger-status --show=[option] was coming from and how it was assembled. Still, I don't really grasp what the 'agent process' does, and I would like to know, @FooBarWidget, whether it's responsible for executing some stuff on each request or merely acts as some kind of watchdog/signal handler. Because if it's the former, then I would foresee a decrease in performance depending on how many calls to the C/C++ bindings the agent process has to make (which could be negligible if it's minimal).

FooBarWidget commented 5 years ago

@patleb The agent is the bulk of Passenger, so it's the former. It's where the main HTTP server is.

patleb commented 5 years ago

That's unfortunate... even though I exclusively use Ruby with Passenger, I have some very simple optimized Rails requests that are around 1ms and the C bindings could easily double that time (assuming that CGo will integrate with the Nginx module and the Ruby extension). Although, I wouldn't worry too much for requests over 50ms which might be what you're targeting, but I would definitely take into consideration that CGo is worth it when compute time is significantly greater than the C calls time.

I don't mean to criticize the choice of language (it's more like a warning). I'm all in for anything that makes your life easier, but when I tried to replace some part of Ruby with Go through an extension (maybe 2 years ago), it turned out to be on par with not having an extension at all (very disappointing). After investigation, the compute time was significantly smaller with Go, but the bindings made that irrelevant. Basically, I went back to C++ using rbplusplus and all was as good as I expected. Since it was well integrated with Ruby, I could simplify some parts by adding a precompile step that processes ERB within the code, so I could share/simplify some code with Ruby.

Anyway, I hope that Go won't impact Passenger too much and just brings good stuff, because other than C/C++/Rust I don't see (or know of) other low-level alternatives.

yaslam100 commented 5 years ago

Most java concerns mentioned in the article can be solved using GraalVM

joshbaptiste commented 5 years ago

You should have a look at Nim: statically typed, compiles to C, generics, optional GC, etc. It started around the same time as Go (~2008), with some brilliant folks behind it and a rich standard library.

tegk commented 5 years ago

I think Go would be a good fit for your needs. Do not use Fast HTTP though as we have seen no advantage in production.

brandondrew commented 5 years ago

Please consider using Pony: https://www.ponylang.io/

Pros:

Cons:

justinclift commented 5 years ago

@philayres As a general data point, Go seems to be very efficient memory-wise for server apps. As one example (and not well written code 😉), DBHub.io generally stays up for months at a time on a low-end Scaleway server and uses about 30MB of RAM. 😄

And yeah, it's super low usage (basically just sits there). Things like Gitea - a GitHub clone written in Go - run effectively on a Raspberry Pi, unlike (say) GitLab which is generally a resource pig.

With Go on Windows and OSX until um... I think it was Go 1.9 or 1.10 the debugging story wasn't great. While the applications themselves were generally ok, there were problems with the way symbol information was stored in the debugging-enabled binaries (this is from memory, so details are fuzzy), so some things just didn't work right.

Personally I moved my OSX desktop to Linux due to it (which solved the problems) so haven't kept an eye on the OSX/Windows state of things. In theory (!) they should be much improved now, but someone using those platforms actively for Go dev would be better to comment, just in case... 😉

If most everyone in your dev Community uses Linux for development though, that's a non-issue.

jcarres-mdsol commented 5 years ago

About Rust vs. Go: it's true that Rust is changing its futures (promises) and async story right now, but that's a late-2018 thing. They are about to release stable async and await keywords in the language (I believe by Christmas?), and once that's done, you could consider it as stable as Go (or more; isn't Go 2 coming?).

I also think most popular libraries are stable in both languages. You can make an argument for Go on the size of ecosystem or size of engineer pool you can find out there. Go is definitely more popular.

Performance-wise, both languages allow you to engineer very high performance, although from random benchmarks around I've always seen Rust performing better by default. According to these benchmarks, Go is slower than C++ in heavy memory and CPU microbenchmarks, though. Not sure if you care about that particular aspect too much. With a lot of I/O, both probably behave similarly.

brandondrew commented 5 years ago

Another option to consider is Crystal.

If I were rewriting Passenger, I would rewrite as much as I could in Crystal, and do anything with tricky concurrency in Pony. Remember, even if Pony looks initially intimidating, it was designed to make certain kinds of things fairly easy that are nearly impossible to do without bugs in C/C++. Learning it might be harder than some languages, but producing good software—when that software hits Pony's sweet spot—should be much easier (after you climb the learning curve).

mersinvald commented 5 years ago

Would advocate for Rust too:

Pros:

Cons:

zhufenggood commented 5 years ago

If you migrate to Go, maybe I'll have a chance to contribute code.