vacp2p / nim-libp2p

libp2p implementation in Nim
https://vacp2p.github.io/nim-libp2p/docs/
MIT License
246 stars 54 forks source link

Expose memory and resource tracking facilities from libp2p. #145

Open dryajov opened 4 years ago

dryajov commented 4 years ago

With #131, libp2p is tracking memory and other resource usage internally, using both nim-metrics and chronos tracking - this needs to be rethought. Preferably, libp2p itself would expose hooks and counters that other applications would use to track resource usage. This is further explained in https://github.com/status-im/nim-libp2p/pull/131#pullrequestreview-396495918.

sinkingsugar commented 4 years ago

I agree completely, also they have a cost kinda too at runtime. They should be opt-out-able.

stefantalpalaru commented 4 years ago

Keep in mind that the nim-metrics integration is opt-in and has no overhead by default.

sinkingsugar commented 4 years ago

Keep in mind that the nim-metrics integration is opt-in and has no overhead by default.

Yeah I mostly meant the chronos ones, using tables, hashing and refs. (Should be small impact tho anyway for now)

zah commented 4 years ago

One of the ideas we had for nim-metrics is to support compile-time configuration specifying which metrics are enabled/disabled. The idea is that a library like libp2p should never face the dilemma whether to add another metric, because adding a metric is always completely free for the users of the library. They have to enable the specific metrics of interest with a compile-time flag (e.g. by specific a list of metrics or a list of chronicles-like topics).

dryajov commented 4 years ago

One of the ideas we had for nim-metrics is to support compile-time configuration specifying which metrics are enabled/disabled. The idea is that a library like libp2p should never face the dilemma whether to add another metric, because adding a metric is always completely free for the users of the library. They have to enable the specific metrics of interest with a compile-time flag (e.g. by specific a list of metrics or a list of chronicles-like topics).

I'm not sure how that would look in practice and sounds like this would couple it to nim-metrics even more, while the current goal is actually to remove it as a dependency and allow the user to choose which reporting mechanism suits their use case the best.

Also, the metrics have a dual purpose, for example any sort of connection management would require keeping track of the number of connections, how fast the link is, how long it has been opened, etc... This same information can be exposed through APIs that are consumed by some reporting mechanism. So I like to think about this as two components, the actual accounting mechanism and an API that exposes this metrics to both, different parts of libp2p as well as external consumers.

Perhaps, it's a mistake to talk about this two distinct use cases in the same issue, but the underlying mechanism seems to overlap enough to be able to treat them very much the same.

Let me know if I misinterpreted what you're proposing @zah.

zah commented 4 years ago

Why would you want to avoid nim-metrics as a dependency?

As I said, it's supposed to be a comprehensive solution giving more power to the users. If you use an internal solution, you are very unlikely to implement the compile-time controls and you'll be facing the dilemma that I mentioned (every metric comes with a certain cost, so you'll have to decide if including a certain metric is reasonable for all users).

For the specific vision of combining the internal accounting mechanisms with the published metrics, there are indeed different ways to approach the problem. One easy way would be to use normal variables that are updated through functions that also update a corresponding optional metric. Directly using the metrics counters is also quite reasonable, although you'll have to write some code that checks that the user haven't accidentally disabled the particular metric in their build.

dryajov commented 4 years ago

Because I'd rather keep libp2p standalone and with as little dependencies as possible. nim-metrics will add little in making libp2p achieve is main goal, which is to be a cross platform, p2p networking stack, this includes browsers (wasm and/or js). Adding nim-metrics would complicate portability and also increase it's overall footprint. A common API that client applications consume, on the other hand, allows consumers to of the library to choose whatever metrics/reporting solution suits their needs best.

supposed to be a comprehensive solution giving more power to the users

What's comprehensive for you might not be for others, this is a subjecting statement. I'd much rather to leave this kind of things up to the user to decide.

If you use an internal solution, you are very unlikely to implement the compile-time controls and you'll be facing the dilemma that I mentioned (every metric comes with a certain cost, so you'll have to decide if including a certain metric is reasonable for all users).

That's a pretty arbitrary statement - why would I forget if it's a core requirement and if it isn't why would I care?

One easy way would be to use normal variables that are updated through functions that also update a corresponding optional metric.

This sounds like it would increase the memory footprint for no reason.

Directly using the metrics counters is also quite reasonable, although you'll have to write some code that checks that the user haven't accidentally disabled the particular metric in their build.

So, for example, disabling connection accounting for nim-metrics, would suddenly disable connection limits in libp2p?

I appreciate the work put into nim-metrics and I think it's going in the right direction, but I simply don't see the need to couple it with what aims to be a standalone library.

zah commented 4 years ago

You didn't address my main point. A library faces hard time selecting suitable metrics for its users, because all metrics have cost, but different users need/want different metrics at different times.

Implementing such a configurable metrics library takes quite a lot of effort and since the metrics are not critical for the main goals as you've stated, it's very unlikely that you'll ever have the time or inclination to implement similar capabilities in LibP2P. Hence, the end result will be that by not adopting nim-metrics, LibP2P is likely to be less capable.

I think part of the misunderstanding is that we are talking about different things here. If for example, LibP2P has traffic throttling options, peer count limits and so on as part of the settings specified by the user when creating a Switch, then by all means these should exist as regular variables in the code. The metrics I care about are various health and performance metrics that the users might be interested in monitoring. The more such metrics we have, the better, but the dilemma I presented above is an obstacle for adding them.

The portability is not a problem, because the metrics library has compilation modes where it completely disappears from the final code. I'm sure it would be easy to add browser specific back-ends as well.

Using a metrics library is similar to using a logging library. You don't want to build that in every project.

sinkingsugar commented 4 years ago

Using a metrics library is similar to using a logging library. You don't want to build that in every project.

It is also true that potential users have their own logging and metrics library quite likely so enforcing our "brand" is hardly flexible.

zah commented 4 years ago

You are applying common wisdom from C++ and other languages to Nim. This common wisdom is no longer valid, because in Nim we have more powerful compile-time code generation capabilities.

In the common-wisdom approach, the typical way to support pluggable logging framework is through an interface like ILoggerthat takes string messages. This means that the code for preparing the string messages is always compiled as part of the library and the logging format options are limited to few enumerated choices offered by the library.

With the approach of Chronicles, all aspects of the logging code can be modified at compile-time. You can decide to erase all logging code, you can choose any kind of output format and you can adapt the logging statements to any kind of alternative logger. That's because Chronicles offers compile-time mechanisms that allow you to modify the generated code.

The logging statements and the metrics updates are supposed to just capture the intent of the library author regarding what useful information can be logged and measured. The compile-time configuration offered by these libraries allows the user to pick the right subset of the offered information and to adapt it to their specific desired output formats and back-ends. That's the ultimate flexibility.

sinkingsugar commented 4 years ago

The ILogger might apply to C# that has no meta-programming, but in modern C++ I assure you a chronicles like library is very possible (and they do exist actually) with all the needed compile time magic.

But anyway what I mean is users likely have their own software stack and likely try to minimize dependencies. I suppose this is also what @dryajov means.

If you import libp2p you wanna import just libp2p not a dozen of hidden abstractions and dependencies + libp2p.

The way rust does this is with features which in turn will enable other dependencies optionally. If we had such an option I would rest my case, but without such option I agree with @dryajov

p.s. config.nims and --define are nearly almost like features of rust but not quite clear, defined and standardized. Specifically you cannot import configs.. etc

zah commented 4 years ago

I've said it already. By refusing to accept dependencies, you are making the library less capable. Can you argue with that?

I guess I'm lucky because Chronicles is already well adopted and nim-metrics to some extent too :)

stefantalpalaru commented 4 years ago

If you import libp2p you wanna import just libp2p not a dozen of hidden abstractions and dependencies + libp2p.

Too late for that: https://travis-ci.org/github/status-im/nim-libp2p/jobs/678438057#L282

So the libp2p dependencies so far are: nimcrypto, bearssl, chronicles, json_serialization, serialization, faststreams, stew, chronos, testutils, secp256k1 and metrics. And the problem is metrics?

zah commented 4 years ago

With a proper package manager, the number of dependencies doesn't matter that much. What matters is how much code ends up in the final binary.

dryajov commented 4 years ago

The ILogger might apply to C# that has no meta-programming, but in modern C++ I assure you a chronicles like library is very possible (and they do exist actually) with all the needed compile time magic.

Logging in C/C++ has usually been done with macros and defines, which is compile time, not sure how modern C++ handles it now, but similar approaches to chronicles have existed in C/C++ for a long time. So I agree with @sinkingsugar.

Also, the point of ILogger and alike isn't compile or runtime so much as to define a standard interface that decouples the logging backend from the API. In fact, ILogger or similar would benefit Nim as well, we can then switch backends freely.

I guess I'm lucky because Chronicles is already well adopted and nim-metrics to some extent too :)

Well, actually if I would do it all over again, I would probably adopt an approach similar to ILogger and allow for pluggable backends.


As a general remark, I actually think that trying to create a solution that suits everybody's needs is detrimental to a project, specially with things like logging, metrics, reporting, etc... Logging and metrics specially, are super infrastructure dependant and certain industries have to comply with specific regulations (healthcare, banking, etc...) that prevents them from using an off the shelf solution, so the more freedom they have to plug their own mechanisms the more like they are to adopt a project/library.

So in other words, as with everything, there should always be an escape hatch, otherwise a lot of potential users are going to be turned off.

dryajov commented 4 years ago

I've said it already. By refusing to accept dependencies, you are making the library less capable. Can you argue with that?

Not really, the point is not to do any sort of "vendor lock-in". If it's possible to provide the same functionality without tying to any specific dependency, then that is preferable to shipping everything as one package. Ofcourse, as with everything - YMMV.

dryajov commented 4 years ago

With a proper package manager, the number of dependencies doesn't matter that much. What matters is how much code ends up in the final binary.

It matters and a lot, specially when it comes to security and stability. The more external packages you depend on the more vulnerable you are to possible hijacking attacks (https://www.theregister.co.uk/2018/11/26/npm_repo_bitcoin_stealer/) and catastrophic (or not so much - https://blog.npmjs.org/post/141577284765/kik-left-pad-and-npm) infrastructure failures. So as much as I love package managers, they don't remove all the risks associated with external dependencies.

zah commented 4 years ago

Chronicles already support the equivalent of anything you can do with an ILogger interface. Please provide any challenge and I'll explain how it's done in Chronicles.

In fact, Chronicles is the "standard interface" you are referring to here. It decouples the back-end from the logging API in very flexible ways that allow you to change almost any aspect of the generated code. It would be completely impractical for you to try to recreate the features of Chronicles within LibP2P.

I still firmly believe you are rooted in the traditional ways of thinking because you don't recognise how Chronicles and Metrics can be adapted to any back-end. Please think about my remark that the the goal of these libraries is to capture perfectly the intention of the library author. Once you've captured the intention, you can adapt the generated code to any regulatory requirement through compile-time mechanisms offered by these libraries. They are the "standard interface".

shipping everything as one package

People tend to put high value on the "no dependencies" badge, because in some eco-systems the dependencies can be hard to manage (again, C++ being the prime example here). This is another case of using common wisdom that might not necessarily be relevant any more.

risks associated with external dependencies

Let's not get carried away. We are actually discussing libraries developed by your fellow Status developers.

sinkingsugar commented 4 years ago

Let's not get carried away. We are actually discussing libraries developed by your fellow Status developers.

While I don't really completely agree with the rest, this last sentence definitely is true and so we should probably not set this issue/discussion as critical.

dryajov commented 4 years ago

Let's not get carried away. We are actually discussing libraries developed by your fellow Status developers.

Well, I obviously meant this in the general case, I hope that isn't open to interpretation, so I reiterate, I'm talking about packages managers and dependencies in general in that particular sentence (and that reply).

dryajov commented 4 years ago

Chronicles already support the equivalent of anything you can do with an ILogger interface. Please provide any challenge and I'll explain how it's done in Chronicles.

Great, I suspected that, and it means we can adopt it as the standard interface while still leaving the backend up to the user to choose.

zah commented 4 years ago

Great, I suspected that, and it means we can adopt it as the standard interface while still leaving the backend up to the user to choose.

I'm glad that we reached agreement. Now, my point is that the same should be true for nim-metrics. Even if nim-metrics is not configurable enough yet, the API is such that all of the same compile-time configuration with a re-targettable back-end is possible there as well.

We can assume that it will be the standard interface for mertrics and that metrics are practically free to add (because they can be disabled in the default configuration), so the more metrics we have, the better.

sinkingsugar commented 4 years ago

@dryajov What to do about this one? Should we add more "observers" to connections/streams etc actually?

dryajov commented 4 years ago

Mm, honestly I was hopping we can have a consistent api to instrument all of this. So I'd still keep this open for now.

For now we can keep going what we're doing, but eventually we should expose a saner api.

sinkingsugar commented 4 years ago

Not sure I understand tho, what a saner api would be? I think this is completely unrelated to trackers ( which are just a internal debug tool) and metrics ( which is a high level visualization tool ). That's why I mentioned observer pattern as a solution. But you have something else in mind ?