Richard Jones | Rewriting Playdar: C++ to Erlang, massive savings

RJ commented 3 years ago

Written on 02/10/2011 17:40:18

URL: http://www.metabrew.com/article/rewriting-playdar-c-to-erlang-massive-savings

RJ commented 3 years ago

Comment written by Guest on 10/21/2009 22:45:42

Heretic!
Now I finally have no excuse to get into Erlang and start contributing. ;-)

RJ commented 3 years ago

Comment written by Steven Gravell on 10/21/2009 23:21:58

The few days I spent learning C++ and tinkering with Playdar was all for nothing... nothing!

~75% is definitely way more than I was expecting after you mentioned this post in the pub.

Norman, your job is now to reimplement everything you've ever done in C++ in Erlang. Ready?... Go.

RJ commented 3 years ago

Comment written by Ryan Richards on 10/22/2009 00:14:23

Now I can feel like my money wasn't wasted on the Erlang book, since everyone and their dog is talking about Scala.

RJ commented 3 years ago

Comment written by Erik Frey on 10/22/2009 01:45:36

But... but...

I was curious to see if Erlang had some kind of support for binding to C libraries, and how you got taglib working from Erlang, but you're just spawning a process and talking on stdio! That's cheating! :p

That's about the biggest nit I can find to pick. And I have a C++ server that's using python right now by running ./server myfifo so I can't really say anything :)

Pretty awesome. I guess I'm with Norman, a big pain in the ass VM was my last excuse. I'm out of excuses!

RJ commented 3 years ago

Comment written by Erik Frey on 10/22/2009 01:47:59

Wah-hey it broke my code, let's try html-escaping:

./server < myfifo | script.py > myfifo

RJ commented 3 years ago

Comment written by James on 10/22/2009 04:28:42

It probably doesn't help that I don't know Erlang, but I don't understand why any time I read other people's Erlang code I see all these one letter variable names and weird, inconsistent capitalization. E.g.:

handle_info({udp, _Sock, {A,B,C,D}=Ip, _InPortNo, Packet}, State) ->
    ?LOG(debug, "received msg: ~s", [Packet]),
    {struct, L} = mochijson2:decode(Packet),
    case proplists:get_value(<>,L) of
        <> ->
            Qid = proplists:get_value(<>, L),
            case resolver:qid2pid(Qid) of
                Qpid when is_pid(Qpid) ->
                    {struct, L2} = proplists:get_value(<>, L),

What is all this A,B,C,D, Qid, L, L2, etc.? And why is your C++ code so vertically spaced out in some places?

unsigned short port = DEFAULT_LAN_PORT;
string ip;
if( v.type() == str_type )
{
    ip = v.get_str();
}
else if( v.type() != array_type )
{
    continue;
}
else
{

This could as easily and as (more?) readably be written

unsigned short port = DEFAULT_LAN_PORT;
string ip;
if( v.type() == str_type ) { ip = v.get_str(); }
else if( v.type() != array_type ) { continue; }
else {

Voila! 50% savings, C++ versus . . . C++.

By the way, I'm not trying to knock your choice in language by any means. From everything I've heard, Erlang is a great language. And there may very well be a difference between it and C++ in terms of conciseness. But when I see all these line count comparisons sometimes I get a sense that it's less that one language offers more conciseness than another and more that the author _wants_ it to, and thus codes differently such that the conclusion they wish to draw is supported.

RJ commented 3 years ago

Comment written by evgen on 10/22/2009 04:33:15

If you want to interface to external code there are three options: ports (stdin/stdout piping) which are dead simple and prevent a crash in your external code from taking down the Erlang VM at the cost of some speed due to serialization/de-serialization, linked-in drivers which can load up DLLs and shared libraries and make their functions available with none of the port overhead at the risk of a null pointer ref or some other bug in the library taking down the whole Erlang VM, or nodes that interface for a specific language. The last option is basically a process running in language X (C, Python [including one option that uses the Twisted event loop], Ruby, etc.) that knows how to speak the erlang node protocol and can basically pretend to be another distributed node in the system. This option is less well known than the other two but is often a good one to look at; you get somewhat faster data transfer by only needing to convert data structures to something specific for your preferred language when/if you actually need the data and you can call specific functions across the node boundary (e.g. call an Erlang function from Python or call a C function from Erlang.)
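To make the first option concrete, here is a minimal C++ sketch of the external side of a port, assuming the Erlang side opens it with the `{packet, 2}` option so that every message on stdin/stdout is preceded by a 2-byte big-endian length. The helper names `write_frame`, `read_frame` and `encode_frame` are hypothetical, and this is not taken from the Playdar sources; the actual taglib helper may frame its messages differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Write one frame: a 2-byte big-endian length followed by the payload.
// (Hypothetical helper; assumes the port was opened with {packet, 2}.)
void write_frame(std::ostream& out, const std::string& payload) {
    assert(payload.size() <= 0xFFFF);  // a 2-byte header cannot describe more
    const auto len = static_cast<std::uint16_t>(payload.size());
    out.put(static_cast<char>(len >> 8));
    out.put(static_cast<char>(len & 0xFF));
    out.write(payload.data(), len);
    out.flush();  // the Erlang side only sees the message once it is flushed
}

// Read one frame into payload. Returns false on EOF or a truncated frame.
bool read_frame(std::istream& in, std::string& payload) {
    char hdr[2];
    if (!in.read(hdr, 2)) return false;
    const std::size_t len =
        (static_cast<std::size_t>(static_cast<unsigned char>(hdr[0])) << 8) |
        static_cast<std::size_t>(static_cast<unsigned char>(hdr[1]));
    payload.assign(len, '\0');
    if (len == 0) return true;
    return static_cast<bool>(in.read(&payload[0], static_cast<std::streamsize>(len)));
}

// Convenience wrapper: return the framed bytes as a string.
std::string encode_frame(const std::string& payload) {
    std::ostringstream out;
    write_frame(out, payload);
    return out.str();
}
```

A real port program would call `read_frame(std::cin, msg)` in a loop and answer with `write_frame(std::cout, reply)`. If it crashes, the Erlang VM just sees the port close, which is the isolation benefit described above.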

RJ commented 3 years ago

Comment written by RJ on 10/22/2009 09:55:20

The reason the C++ code exists and is run as a separate process (for taglib) is because that's one of the three Erlang ways to integrate with external code. It's the simplest and cleanest way. evgen covered the three ways in his comment above. I'd actually claim that as one of the great things about Erlang - it's easy to interface with external code in a standard, supported way that makes the external code look like an Erlang process (Ports).

Regarding the SLOCcount for the LAN plugin, I adjusted the C++ line count down when collecting these stats because I didn't implement the PING/PONG stuff in the Erlang code (i.e., I removed that code from the C++, then counted the lines). So I still think it's a reasonable comparison.

I'll admit the style/newline proliferation in some of the C++ code will have inflated the line count a little, and it could certainly be written with fewer newlines (and less readability, some would say), but we're still in the right ballpark.

Playdar is often network/IO bound, but it also does a lot concurrently, with plugins doing things in parallel and then notifying the main resolver when they find something. Erlang-style concurrency is perfect for this.

RJ commented 3 years ago

Comment written by RobW on 10/24/2009 14:41:04

@James

Concerning "inconsistent" capitalization, Erlang is 100% consistent with capitalization. It is *enforced*. In Erlang, variables *always* start with an upper-case letter (or an underscore), whereas unquoted "atoms" *always* start with a lower-case letter. An atom (like anything in the world that is *truly* an atom) is something that is meant to be indivisible: you can't reduce it. An atom is like a variable name where you use the name itself; there is no value associated with the name.

Concerning "L" versus "L2", Erlang is a single-assignment language. These are variables, since they are capitalized, but the naming is used to show versioning of variables (e.g. making change explicit). Within the same *scope*, once a value is bound to a variable, the variable cannot be reassigned. This design philosophy is meant to eliminate whole categories of programming errors, which is important since *reliability* is Erlang's primary goal. With multiple-assignment that most languages use, it's almost as if you have to track the state of variables in addition to the state of objects, because the same name can be bound to different values at different times within the same scope. Erlang's need for distributed programming in order to allow fail-over and similar features requires reliable concurrency. Reliable concurrency can't happen if you have to track a lot of messy state. Therefore, at every opportunity, Erlang tries to be as stateless as possible. Only each process as a whole has state by continuously passing its variables back to itself via a recursive function that acts as a main loop (it doesn't run out of stack/memory due to tail-call optimization being required).
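For readers coming from C++, a rough analogue of this style is sketched below. It is only an illustration under stated assumptions: `State`, `handle` and `loop` are hypothetical names, not Playdar code. Every binding is `const`, each "change" produces a new value under a new name (`s2`, like `L2` above), and the loop carries its state forward by calling itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical process state: how many messages were handled, and the last one.
struct State {
    int handled;
    std::string last;
};

// Returns a new State instead of mutating the old one.
State handle(const State& s, const std::string& msg) {
    return State{s.handled + 1, msg};
}

// The "main loop": state is passed back into the function on every step.
// Erlang guarantees this tail call runs in constant stack space; C++ does not,
// so treat this as an illustration of the style rather than production code.
State loop(const State& s, const std::vector<std::string>& inbox, std::size_t i = 0) {
    assert(i <= inbox.size());
    if (i == inbox.size()) return s;
    const State s2 = handle(s, inbox[i]);  // new name for the new version
    return loop(s2, inbox, i + 1);
}
```

Because nothing is reassigned, the value of `s` at any point in `loop` is whatever it was bound to on entry, which is the property single assignment gives you for free in Erlang.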

Concerning "A", "B", "C", and "D", it looks like the author is pattern-matching in order to assign values to these, so that if you pass into the function an IP address of "127.0.0.1", the result is: A=127, B=0, C=0, D=1. Since you didn't include the entire function code, I don't see where these variables are used unless I dig into the source myself.
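The closest everyday C++ counterpart to that `{A,B,C,D}=Ip` match is unpacking a tuple. The sketch below assumes a hypothetical `Ip4` alias and `dotted` helper, and uses `std::tie` so it compiles as C++11; C++17 structured bindings would look even closer to the Erlang. As in the Erlang clause, both the whole value (`ip`) and its four parts stay available after the unpacking.

```cpp
#include <cassert>
#include <string>
#include <tuple>

// Hypothetical stand-in for the Erlang tuple {A,B,C,D}.
using Ip4 = std::tuple<int, int, int, int>;

// Unpack the four octets and format them as a dotted quad.
std::string dotted(const Ip4& ip) {
    int a, b, c, d;
    std::tie(a, b, c, d) = ip;  // roughly what {A,B,C,D} = Ip does
    assert(a >= 0 && a <= 255 && b >= 0 && b <= 255);
    assert(c >= 0 && c <= 255 && d >= 0 && d <= 255);
    return std::to_string(a) + "." + std::to_string(b) + "." +
           std::to_string(c) + "." + std::to_string(d);
}
```

One difference worth noting: an Erlang match also acts as a check, so a value that is not a 4-tuple simply fails to match that clause, whereas in C++ the shape is fixed by the type at compile time.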

RJ commented 3 years ago

Comment written by Henning Diedrich on 12/23/2009 06:22:35

"I’ve used processes to encapsulate state (active queries, specifically) where I didn’t really need to. It seemed sensible at the time ..." -- what are you using now instead? ETS or Mnesia? The OO/Actor equation seems to encourage the encapsulation of state in processes. After the experience you had there, any suggestions on how to think one's way out of that? Back to separation of instructions and data, halfway? I have come to think that Mnesia is more integral than it looks at first glance. Even though it 'feels' too big to be the standard way of handling state, without its transactions something is missing. ETS tables are not sufficient. Does the abolition of locks and syncs simply require transactions for common state handling, or is it merely a limit on applicability where shared state is part of the requirements?

RJ commented 3 years ago

Comment written by fettemama on 02/16/2011 12:15:36

nice hipster hat bro

RJ commented 3 years ago

Comment written by pcunite on 08/20/2013 23:40:15

I rewrote a C++ app to C++ and saved 75%. You like Erlang
... that's cool. The second copy of any logic is going to be better.

RJ commented 3 years ago

Comment written by Peter Marreck on 05/19/2015 17:05:18

If you look at rosettacode.org and browse the various language implementations for various algorithms, I think you will start to notice something: the functional language implementations are usually significantly smaller and more concise than the procedural/OO implementations. I don't think this is a coincidence, and I think it is indicative of something important and fundamentally different.

RJ commented 3 years ago

Comment written by Peter Marreck on 05/19/2015 17:06:30

Take a look at http://elixir-lang.org/. It's like Erlang with a fresh coat of syntax paint. ;)

RJ / www.metabrew.com