Closed amboar closed 11 months ago
One consideration is on naming. As much as prefixing 'phosphor-' on everything has lead to some grumbling, the original reason for doing that was that we were nearly certain that there would never be a package naming collision. Using nvmed
is much more likely that some day someone else makes another nvmed
and we get surprised by a Yocto update that adds a recipe that ends up taking precedent over ours.
(I ran into this problem recently in facebook/openbmc with a package we had named libnm
that ended up colliding with a package provided by the network-manager
recipe.)
I'm not suggesting we have to name this phosphor-foo
but we might want to pick a more unique name.
I would like to discuss a somehow more "generic" approach for daemon like this. Not only nvme has this issue, other peripherals like nic cards/gpus have the same situation, that likely we will have "nic-sensor, gpu-sensor", but also "nic-daemon", "gpu-daemon".
How about considering the above use cases as well?
Aren’t NICs and GPUs using PLDM? They’re certainly not using NVME-MI.
In my opinion, the vast majority of interesting code is unique to the protocol and not due to “I’m making a sensor” or “I’ve got some inventory”. Those common aspects should, in my opinion, be mostly covered by the generated bindings anyhow.
I am not referring to the protocol. The idea here is that for devices like NVMe/NIC/GPU, BMC will need to not only create sensors but also do other management, e.g. inventory, fault monitor, etc.
Such functions do not fit in dbus-sensors, which is why @amboar raises the proposal here. I would like to discuss a generalized approach to handle such use case.
What is the potential code sharing though? I'm not understanding what a 'generalized approach' would look like if the protocols are different and the code makes common dbus interfaces.
My current expectation is that GPU and NICs should/would be supported by openbmc/pldm.
Why would other management functions not just be done through a whole separate daemon. The original idea with dbus is that you could have multiple daemons providing the same objects with different interfaces. You could still provide the sensors bits here.
I am not referring to the protocol. The idea here is that for devices like NVMe/NIC/GPU, BMC will need to not only create sensors but also do other management, e.g. inventory, fault monitor, etc.
Such functions do not fit in dbus-sensors, which is why @amboar raises the proposal here. I would like to discuss a generalized approach to handle such use case.
This is a bit of a "useful reduction" of my argument in the linked design document. I'll bring the three points I was raising over here:
Your argument is around point 1, but somewhat disregards point 2 and 3. It's not clear to me whether you're suggesting adding GPU and NIC support into the repository I'm proposing here, but if you are that runs afoul of point 3 by requiring unrelated, significant, domain-specific knowledge to work with it. I'm trying to reduce the scope of the three specific problems I outlined in the points above by isolating the nvmesensor code.
I don't think there are significant issues with reassessing at some later point whether there is infrastructure that can be shared. I'm happy for those patterns to emerge from the state of the ecosystem rather than try to enforce some top-down structure from the start when it's unclear what it is we're trying to generalise. Please create a separate issue for discussion or a vote on your concerns. Let's try to keep the scope tight here - I prefer that "generalising now" is a non-goal for this TOF issue.
Why would other management functions not just be done through a whole separate daemon. The original idea with dbus is that you could have multiple daemons providing the same objects with different interfaces. You could still provide the sensors bits here.
The NVMe-MI provides the whole stack of functionalities including but beyond the thermal reading. In the other word, the thermal reading is embedded in the NVMe architecture and the health reporting mechanism, which is not suitable to split into multiple daemons.
As for the original NVMe BASIC based nvmesensor. The BASIC protocol is more like a premature and simplified version before NVMe-MI 1.0. But the status report definition is similar to NVMe-MI and worth sharing some of the implementation.
Quoting Andrew's proposal:
Prior to completion and public availability of the NVMe-MI 1.0 specification, we have received feedback that a standardized method is needed by the industry to poll an NVMe device for basic health status information and which only requires SMBus slave support. The intent of the technical note is to publicly release a standardized approach for such a capability.
Why would other management functions not just be done through a whole separate daemon. The original idea with dbus is that you could have multiple daemons providing the same objects with different interfaces. You could still provide the sensors bits here.
@wak-google initially I interpreted your comment here as a response to @mine260309 but in case it wasn't, I would like to avoid a third NVMe stack in the project, as outlined in the design document. That's the motivation for chopping out nvmesensor and extending it (in addition to @drakedog2008 's response above).
One consideration is on naming. As much as prefixing 'phosphor-' on everything has lead to some grumbling, the original reason for doing that was that we were nearly certain that there would never be a package naming collision. Using
nvmed
is much more likely that some day someone else makes anothernvmed
and we get surprised by a Yocto update that adds a recipe that ends up taking precedent over ours.(I ran into this problem recently in facebook/openbmc with a package we had named
libnm
that ended up colliding with a package provided by thenetwork-manager
recipe.)I'm not suggesting we have to name this
phosphor-foo
but we might want to pick a more unique name.
@williamspatrick so tackling this concern, we already have phosphor-nvme
. I will argue that it takes phopshor-nvmed
out as a potential name as they are far too similar. So we'll need something else.
A whimsical approach might be to find a representative word, e.g.:
$ grep -w 'n.*v.*m.*e' /usr/share/dict/words
nevermore
Rumour has it this technique was used to name the samba project. Existing practice for whimsical naming in OpenBMC includes swampd
from openbmc/phosphor-pid-control. If others have suggestions, please list them for consideration.
Based on @edtanous' feedback on the proposal document I'm choosing to defer this for now. We can experiment with some of the strategies he outlined and reassess whether this is necessary.
Thanks for the input.
Design Document
Naming
Maintainers