Closed SandroVaroli closed 2 years ago
Since neither `tpm2-abrmd` nor `dbus` (required by `tpm2-abrmd`) is installed on the container image, I would expect the TCTI module to fail. Similarly, the `tpm2-tools` TRM is probably not present (and seems to be superseded by `tpm2-tabrmd`). The DLL load should print any errors without crashing: ref. However, the fact that the connection attempt is falling through to `TcpTpmDevice` loading makes me think that the container cannot actually open `/dev/tpm{,rm}0` for file operations: ref.
In your container, does the following work?
apt-get install -y tpm2-tools
tpm2_getcap algorithms
Thanks for the response. I looked at the code you sent me, and it is clear that the exception is generated in `TcpTpmDevice.Connect()`.
I'll modify my software to run the test you suggest with tpm2-tools. At the moment, if the module never reaches the device twin it has nothing to do, so it crashes in order to be restarted by edgeAgent and to have an error reported at IoT Hub.
I tried some more tests.
I cannot install tpm2-tools in the container while it is running.
I looked at the `/dev/tpm{,rm}0` permissions and found the following. On the host:
crw-rw---- 1 aziottpm root 10, 224 Jul 19 09:11 tpm0
crw-rw---- 1 aziottpm tss 253, 65536 Jul 19 09:11 tpmrm0
In the container:
crw-rw---- 1 996 root 10, 224 Jul 19 09:11 tpm0
crw-rw---- 1 996 111 253, 65536 Jul 19 09:11 tpmrm0
In the container the `tss` group ID is 102 instead of 111 as on the host, and the user `aziottpm` does not exist in the container.
So this is probably a permission issue. My question now is: how do I have to set up the Dockerfile and/or the module create options, permissions, etc. in order to connect to IoT Hub using TPM attestation from my module?
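To make the suspected mismatch visible from inside the container, a small check along these lines could be run. This is only a sketch: the function name `in_owning_group` is made up for illustration, it assumes GNU `stat` (as found in Debian-based images), and it only compares numeric group IDs without looking at the permission bits.

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool mentioned in this thread):
# report whether the calling user belongs to the numeric group that owns a path.
# It does not inspect the mode bits, only group membership.
in_owning_group() {
    path=$1
    # GNU stat: %g prints the numeric group ID of the owner group
    file_gid=$(stat -c '%g' "$path") || return 2
    for gid in $(id -G); do
        if [ "$gid" = "$file_gid" ]; then
            echo "ok: gid $file_gid owns $path and is one of the caller's groups"
            return 0
        fi
    done
    echo "mismatch: $path is owned by gid $file_gid, caller has gids: $(id -G)"
    return 1
}

# Example call inside the container:
#   in_owning_group /dev/tpmrm0
```

With the listing above, the device node reports GID 111 while the container's own `tss` group is 102, so a user that is only in the container's `tss` group would be reported as a mismatch.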
I also looked at the edgeAgent `/dev` folder and there is no `/dev/tpm{,rm}0` at all, yet it connects to IoT Hub. So is it possible that the right way to connect to IoT Hub is something other than `DeviceAuthenticationWithTpm`?
Thanks, Sandro
Here is the error I got from inside the container after installing tpm2-tools in the Dockerfile and running the suggested command:
/app$ tpm2_getcap -c algorithms
** (process:114): CRITICAL **: 16:37:53.722: failed to allocate dbus proxy object: Could not connect: No such file or directory
ERROR: tcti init allocation routine failed for library: "tabrmd" options: "(null)"
ERROR: Could not load tcti, got: "tabrmd"
`EdgeAgent` only uses `DeviceAuthenticationWithTokenRefresh` for upstream connection, I believe: ref. Regarding the tpm2-tools error, try forcing the use of the device TCTI with `tpm2_getcap -T device -c algorithms`. The `-c` is due to version differences. Also due to version differences is the TCTI loader behavior, which for the version available in the container (3.1.3) appears to try loading `tabrmd` first: ref.
One thing that you could change about your configuration options is the TPM exposure method:
{
  "HostConfig": {
    "Privileged": true,
    "LogConfig": {
      "Type": "json-file",
      "Config": {
        "max-size": "10m",
        "max-file": "10"
      }
    },
    "Binds": [
      "tenovadevicemonitor_data:/app/data"
    ],
    "Devices": [
      {
        "PathOnHost": "/dev/tpm0",
        "PathOnContainer": "/dev/tpm0",
        "CgroupPermissions": "rw"
      },
      {
        "PathOnHost": "/dev/tpmrm0",
        "PathOnContainer": "/dev/tpmrm0",
        "CgroupPermissions": "rw"
      }
    ]
  }
}
According to this mailing list thread you should not need privileged containers for device mounts, but maybe start with `"Privileged"` set to `true` to begin. And, while I think it is unlikely to be the cause of issues, `"CgroupPermissions"` is set to `"rw"` instead of `"rwm"`, which is what `docker container create --device ${DEVICE} ...` sets by default.
Thanks for the feedback.
I configured the create options as you suggested, and from within the container the `/dev/tpm{,rm}0` devices still have the same permissions:
crw-rw---- 1 996 root 10, 224 Jul 19 09:11 tpm0
crw-rw---- 1 996 111 253, 65536 Jul 19 09:11 tpmrm0
My module still returns the same error and the `tpm2_getcap -T device -c algorithms` command still fails in the same way.
So I made the following changes to the Dockerfile:
FROM mcr.microsoft.com/dotnet/sdk:5.0 AS build-env
WORKDIR /app
# Install the nuget credential provider
RUN wget -qO- https://raw.githubusercontent.com/Microsoft/artifacts-credprovider/master/helpers/installcredprovider.sh | bash
# TODO: remove hardcoded PAT
ARG FEED_ACCESSTOKEN
ARG FEED_USERNAME
ENV VSS_NUGET_EXTERNAL_FEED_ENDPOINTS {\"endpointCredentials\": [{\"endpoint\":\"https://pkgs.dev.azure.com/tenovadigital/_packaging/Tenova/nuget/v3/index.json\", \"username\":\"${FEED_USERNAME}\", \"password\":\"${FEED_ACCESSTOKEN}\"}]}
COPY . ./
RUN dotnet restore ./DeviceMonitor/DeviceMonitor.csproj --configfile nuget.config
COPY . ./
RUN dotnet publish ./DeviceMonitor/DeviceMonitor.csproj -c Release -o out
FROM mcr.microsoft.com/dotnet/runtime:5.0-buster-slim
WORKDIR /app
COPY --from=build-env /app/out ./
RUN groupadd -f -g 111 tss
RUN apt-get update && \
apt-get install -y libtss2-tcti-tabrmd0 && \
apt-get install -y tpm2-tools
RUN useradd -ms /bin/bash moduleuser && \
usermod -a -G tss moduleuser
RUN mkdir -p /app/data && chown -R moduleuser:moduleuser /app/data
VOLUME /app/data
USER moduleuser
ENTRYPOINT ["dotnet", "Tenova.PDM.DeviceMonitor.dll"]
Mainly, I added the `tss` group to the container, forcing the same ID as on the host, before the tpm2-tools installation:
RUN groupadd -f -g 111 tss
(I also think tpm2-tools is not needed at all, but I'll clean that up later.) Then I added the user that runs the module application to the `tss` group:
usermod -a -G tss moduleuser
With that, the module was able to connect to IoT Hub using TPM attestation. Here is the log I got:
2022-07-21 09:46:05.8088 Debug HostingLoggerExtensions.Starting [1] - Hosting starting
Exception while loading tpm2-abrmd: System.DllNotFoundException: Unable to load shared library 'libtss2-tcti-tabrmd.so' or one of its dependencies. In order to help diagnose loading problems, consider setting the LD_DEBUG environment variable: liblibtss2-tcti-tabrmd.so: cannot open shared object file: No such file or directory
at Tpm2Lib.AbrmdWrapper.NativeMethods.Tss2_Tcti_Info()
at Tpm2Lib.AbrmdWrapper.Load(IntPtr& tctiCtxPtr)
at Tpm2Lib.LinuxTpmDevice..ctor(String tpmDevicePath)
2022-07-21 09:46:05.9327 Info TPMAuthenticationFactory.GetAuthenticationMethod [1] - Regisrtration id DMC_LINUX_VM_02
2022-07-21 09:46:05.9327 Info TPMAuthenticationFactory.GetAuthenticationMethod [1] - Endorsement key System.Byte[]
2022-07-21 09:46:06.0073 Info AzureDeviceClient..ctor [1] - Requesting Device Twin data
2022-07-21 09:46:07.9082 Trace AzureDeviceClient.<SetDesiredPropertyUpdateCallback>b__6_0 [4] - Desired Property Update Callback set sucessfull
As you can see, the error about `libtss2-tcti-tabrmd.so` is still logged, but the connection succeeded.
The point now is that I cannot change my Dockerfile for each installation, since the `tss` group could have a different ID on different hosts. I have to install and provision maybe 100 devices per year, so I need something better than "I hope the tss group will always have the same ID" in my installation procedure.
Is there a better and more robust solution? Some trustworthy workaround for this?
Thanks, Sandro
What if you just ditch moduleuser?
Also, I installed these: tpm2-tools tpm2-abrmd
udev rule: KERNEL=="tpm0", SUBSYSTEM=="tpm", GROUP="1000", MODE="0660"
@emilm Thanks for the suggestion. It appeared to be a good solution, but unfortunately it does not work for my module.
I noticed that there are already udev rules for the TPM:
# tpm devices can only be accessed by the tss user but the tss
# group members can access tpmrm devices
KERNEL=="tpm[0-9]*", MODE="0660", OWNER="tss"
KERNEL=="tpmrm[0-9]*", MODE="0660", OWNER="tss", GROUP="tss"
so I suppose adding `moduleuser` to the `tss` group should work, but it does not.
I also added the rules that you suggested, but that doesn't work either. It looks like udev rules do not affect the behavior of the module.
I also added the installation of udev to the Dockerfile (just in case the image does not contain it), but it does not work.
@SandroVaroli my udev rule is in the host OS, sorry for not making that clear. The 1000 is really the tss group, but it's not really known from within the container afaik. It's been a long time since I made it work with DeviceClient + TPM from inside a container, but it's 100% doable with stability.
This is my complete install list (I missed one):
RUN apt-get update && apt install -y \
    libtss2-tcti-tabrmd-dev \
    tpm2-tools \
    tpm2-abrmd \
    nano \
    dbus \
    && rm -rf /var/lib/apt/lists/*
In the host OS, add KERNEL=="tpm[0-9]*", MODE="0660", GROUP="tss" (or 1000).
Double check the tss group with: cat /etc/group | awk -F ":" '{ print $1,$3 }' | grep tss
@emilm Thanks a lot, with your last suggestion I'm close to solving the issue.
The point is: the ID of the `tss` group on the host must be the ID of a group that `moduleuser` belongs to inside the container, because the `/dev/tpm{,rm}0` files inside the container inherit their UID and GID from the host. There is no way to make them differ.
So basically we have two solutions:
1. Assign GID 1000 to the `tss` group on the host, which is the group of the user running the module. To do this:
# find the tss group id
cat /etc/group | awk -F ":" '{ print $1,$3 }' | grep tss
# change the tss group id, allowing a duplicated id
sudo groupmod -o -g 1000 tss
# change the group id for files belonging to tss
find / -gid <old_tss_id> -exec chgrp -v 1000 '{}' \;
That solution keeps things simple and does not modify the Dockerfile, leaving it agnostic of conventions made on the host machine. The user running the module will have rw permission on the TPM devices.
The drawback is that on the host machine group 1000 is normally the group of the administrative user created during system installation, so the `tss` user will get group permissions on everything assigned to that admin group.
For example, suppose that at installation time you create the user `foo`. A group `foo` is automatically created with GID 1000, and all the files in the home directory of `foo` are assigned to that group. After you assign GID 1000 to the group `tss`, there will be no distinction between the two groups (they will be aliases of each other), and this could be considered a security issue.
2. Assign a conventional GID (let's say 1001) to the `tss` group on the host, create a group with the same GID in the Dockerfile, and add `moduleuser` to it. Change the GID of the `tss` group on the host with the following steps:
# find the tss group id
cat /etc/group | awk -F ":" '{ print $1,$3 }' | grep tss
# change the tss group id without allowing a duplicated id
sudo groupmod -g 1001 tss
# change the group id for files belonging to tss
find / -gid <old_tss_id> -exec chgrp -v 1001 '{}' \;
then in the Dockerfile:
RUN groupadd -f -g 1001 tpmaccess
RUN useradd -ms /bin/bash moduleuser && \
    usermod -a -G tpmaccess moduleuser
This method is based on a convention that the container and the host must share. That is not very clean, but it allows always having the same `tss` group ID on all devices, and a container using it, without breaking security rules.
I'll clean up the project and retry in a clean environment. Then I'll post the final working solution.
Thanks a lot to @emilm for his help; it was essential to reaching a solution.
Sandro
I arrived at a final solution.
The `/dev/tpm{,rm}0` devices of the host must be bound to the ones in the container, and the user running the module must belong to a group whose GID is the same as the GID of the `tss` group of the host.
In my project, the device binding (with a privileged container) is handled as follows in the create options of the module:
{
"HostConfig": {
"Privileged": true,
"Binds": [
"tenovadevicemonitor_data:/app/data",
"/dev/tpm0:/dev/tpm0",
"/dev/tpmrm0:/dev/tpmrm0"
]
}
}
For the GID requirement the job is a bit more complex.
We could change the `tss` GID on the host to 1000 as suggested by @emilm here, but since this is an administrative group with special privileges we chose a different way.
On the host we changed the GID of `tss` to a fixed value, e.g. 2000, to ensure it will be the same on any installation.
#find the tss group id
cat /etc/group | awk -F ":" '{ print $1,$3 }' | grep tss
#change the tss group without allowing duplicated
sudo groupmod -g 2000 tss
#change the group id for files belonging to tss
find / -gid <old_tss_id> -exec chgrp -v 2000 '{}' \;
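Since these host steps have to be repeated on every device, they could be wrapped in a small script. The sketch below is only an illustration: the function names `group_gid` and `plan_tss_regid` are made up, it reads a group(5)-format file (by default `/etc/group`), and it only prints the commands from the steps above instead of running them, so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print the numeric GID of a group from a
# group(5)-format file (default: /etc/group). Exact name match only.
group_gid() {
    name=$1
    file=${2:-/etc/group}
    awk -F: -v n="$name" '$1 == n { print $3; exit }' "$file"
}

# Hypothetical helper: print (do not execute) the commands needed to move
# the tss group to a new GID, following the manual steps in this comment.
plan_tss_regid() {
    new_gid=$1
    file=${2:-/etc/group}
    old_gid=$(group_gid tss "$file")
    if [ -z "$old_gid" ]; then
        echo "tss group not found in $file" >&2
        return 1
    fi
    if [ "$old_gid" = "$new_gid" ]; then
        echo "# tss already has gid $new_gid"
        return 0
    fi
    printf '%s\n' "sudo groupmod -g $new_gid tss"
    printf '%s\n' "find / -gid $old_gid -exec chgrp -v $new_gid '{}' \\;"
}

# Example call on the host:
#   plan_tss_regid 2000
```

Printing the commands rather than executing them is a deliberate choice for a sketch: the `find ... chgrp` step touches the whole filesystem, so it seems safer to look at the plan before running it.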
Finally, we modified the Dockerfile to add `moduleuser` to a group with GID 2000.
FROM mcr.microsoft.com/dotnet/sdk:5.0 AS build-env
WORKDIR /app
# Install the nuget credential provider
RUN wget -qO- https://raw.githubusercontent.com/Microsoft/artifacts-credprovider/master/helpers/installcredprovider.sh | bash
# TODO: remove hardcoded PAT
ARG FEED_ACCESSTOKEN
ARG FEED_USERNAME
ENV VSS_NUGET_EXTERNAL_FEED_ENDPOINTS {\"endpointCredentials\": [{\"endpoint\":\"https://pkgs.dev.azure.com/tenovadigital/_packaging/Tenova/nuget/v3/index.json\", \"username\":\"${FEED_USERNAME}\", \"password\":\"${FEED_ACCESSTOKEN}\"}]}
COPY . ./
RUN dotnet restore ./DeviceMonitor/DeviceMonitor.csproj --configfile nuget.config
COPY . ./
RUN dotnet publish ./DeviceMonitor/DeviceMonitor.csproj -c Release -o out
FROM mcr.microsoft.com/dotnet/runtime:5.0-buster-slim
WORKDIR /app
COPY --from=build-env /app/out ./
# TPM ACCESS ENABLING -- START
RUN groupadd -f -g 2000 tpmaccess
RUN useradd -ms /bin/bash moduleuser && \
usermod -a -G tpmaccess moduleuser
# TPM ACCESS ENABLING -- END
RUN mkdir -p /app/data && chown -R moduleuser:moduleuser /app/data
VOLUME /app/data
USER moduleuser
ENTRYPOINT ["dotnet", "Tenova.PDM.DeviceMonitor.dll"]
That's all. If the module software only uses `DeviceClient` and `Microsoft.Azure.Devices.Provisioning.Security.Tpm`, there is no need to install tpm2-tools or other packages in the container.
The following code will succeed:
var authenticationMethod = new DeviceAuthenticationWithTpm(deviceId, new SecurityProviderTpmHsm(deviceId));
_deviceClient = DeviceClient.Create(iotHubHostname,
authenticationMethod , settings);
One question for @onalante-msft: can this information be added to the DeviceClient or IoT Edge documentation? It would save a lot of people a lot of trial and error.
Thanks, Sandro Varoli
Sorry about missing the ping, I had a flurry of emails.
I can bring this up with the team to see which documentation kind this should be filed under; this is a use case somewhat outside the regular module operation scope, so it may be better suited for devdocs/advanced documentation.
@SandroVaroli what is it that your module needs to do as the DeviceClient? I'm looking to better understand what necessitates using DeviceClient instead of ModuleClient in your scenario.
@SandroVaroli ping
Maybe my case is useful: I wanted a module to receive Device Streams, and that's only possible through the Device Client.
Thanks for sharing, @emilm. In your case is the ultimate goal to enable secure access to the device (ssh) for things like remote troubleshooting / debugging? Or is it for something else?
> Thanks for sharing, @emilm. In your case is the ultimate goal to enable secure access to the device (ssh) for things like remote troubleshooting / debugging? Or is it for something else?
Yes primarily! SSH also can tunnel to other devices on the device's network so that's used as well.
@micahl
Sorry, I'm out of the office on my summer holiday, so there was a delay before I could answer you.
For me the point is this:
We have hundreds of devices around the world, and there are several device-dependent configurations that we store in the twin.
For example, the scope of one of my modules is to retrieve the complex configuration of other modules from a set of APIs and distribute it to each module. This is done because the configuration can be modified at any time by means of a configuration portal we developed, which is available to our internal product and project engineers. Since we like to have a minimum of security, an AAD app registration is done for each device at creation time, so each device can access the right API set using its unique registration, which can be revoked or renewed at any time. These credentials are stored in the device twin desired properties, and the device also registers a callback for any desired property change.
Why do we not store this information in the module twin instead?
Because the only way to update the module twin is to create a new deployment, and this does not fit our needs. In fact, since we have hundreds of devices, we rely on layered deployments (more or less one layer for each possible module or microservice), and the same deployment can be assigned to tens of devices. When we update the software version of a container (and with more than twenty possible modules at the moment, this happens quite often), it is easily distributed to all the devices that need it by cloning and updating the version in a single deployment.
If we had one deployment for each device, we would need more than one IoT Hub (due to the maximum deployment limit), and updates would be a nightmare since we would have to clone and update hundreds of deployments at a time, essentially losing the advantages of layered deployments.
I hope I was able to give you an idea of our need to connect using `DeviceClient` instead of `ModuleClient`.
This is just a quick example of the reason, but if you think it is good to know, I can be more specific after mid-August when I'm back in the office.
> If the module software only uses the DeviceClient and Microsoft.Azure.Devices.Provisioning.Security.Tpm ...
Does this "module" only use the DeviceClient? I'm wondering whether for a scenario such as those described here an acceptable solution would be to create an IoT agent that uses the device identity instead of a module identity. The referenced doc discourages using a device identity in the general case, but if the agent is only intended for devices you own/operate then there's no reason you couldn't use a device identity. There's an example of doing so here: https://github.com/vslepakov/iot-identity-service-agent. Just be aware that the facilities offered by the IoT Edge runtime aren't available when interacting at that lower layer.
Thanks @micahl,
We will investigate the IoT agent; however, at first impression it seems to have huge limitations:
> The Identity Service package does not provide non-Edge modules with the higher-level capabilities provided by the IoT Edge runtime.
> Edge modules are managed through the IoT Edge runtime which enables:
> - Deploying cloud workloads to an edge device
> - Monitoring the health of the deployed Edge modules
> - Gathering and reporting runtime quality metrics
> - Operating offline for an indefinite period of time
> - Managing communication between Edge modules
> - Remotely restarting a module or retrieving logs
All these features, except communication between modules, which takes place using RabbitMQ as communication middleware, are used and required by all our modules.
The sentence
> If the module software only uses the DeviceClient and Microsoft.Azure.Devices.Provisioning.Security.Tpm ...
referred to TPM access. So, if the TPM information is retrieved only by Microsoft.Azure.Devices.Provisioning.Security.Tpm, there is no need to install further tpm2-tools packages inside the container.
> it seems to have huge limitations
Yeah, I think that answers my question on what else your module uses (or depends on).
Closing this as the original issue has been resolved. For any feature requests / suggestions please add to or upvote entries on https://aka.ms/iotedge-suggestion.
Thank you for the solution and the detailed explanations.
Do you know where the second "lib" (`liblibtss2-tcti-tabrmd.so`) comes from in the following message:
Exception while loading tpm2-abrmd: System.DllNotFoundException: Unable to load shared library 'libtss2-tcti-tabrmd.so' or one of its dependencies. In order to help diagnose loading problems, consider setting the LD_DEBUG environment variable:
liblibtss2-tcti-tabrmd.so: cannot open shared object file: No such file or directory
Isn't that libtss2-tcti-tabrmd0?
Yes, but "lib" appears twice in the logs.
I don't know whether the TSS tries to load the wrong filename (with "liblib") or whether it is only a logging mistake and the error above is unrelated to this "liblib" issue.
That's weird. You need to state the exact version of the .NET TPM libraries and check in the source code whether it's liblib there. Personally I gave up on using this and made a REST client to talk directly to the 1.4 aziot-tpmd. You can also try copying libtss2-tcti-tabrmd.so to liblibtss2-tcti-tabrmd.so.
Yes, I did the copy, and it worked! Thank you very much, I will consider the alternative of REST client as well.
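For anyone repeating the copy workaround, it could look roughly like the sketch below. The function name `copy_liblib` is made up, and the directory that holds `libtss2-tcti-tabrmd.so` is deliberately left as a parameter because it is not stated in this thread and will depend on the image.

```shell
#!/bin/sh
# Hypothetical helper: inside directory $1, copy libtss2-tcti-tabrmd.so to the
# doubled-prefix name (liblibtss2-tcti-tabrmd.so) that appears in the error message.
# The library directory is an assumption to be supplied by the caller.
copy_liblib() {
    dir=$1
    src="$dir/libtss2-tcti-tabrmd.so"
    dst="$dir/liblibtss2-tcti-tabrmd.so"
    if [ ! -e "$src" ]; then
        echo "not found: $src" >&2
        return 1
    fi
    # -f overwrites an existing copy so the helper can be re-run safely
    cp -f "$src" "$dst"
}

# Example call (the directory is a placeholder, not a known path):
#   copy_liblib /path/to/library/dir
```

This only mirrors the copy described above; it does not explain why the loader asks for the "liblib" name in the first place.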
I have provisioned my Linux device using TPM. Everything worked fine and my modules have been successfully deployed on IoT Edge. One of the modules must connect to IoT Hub using DeviceClient in order to retrieve device twin information and apply some business logic to it. The connection using TPM fails. The working version of the software, which uses symmetric key authentication, is:
The non-working version for TPM is:
When a device is provisioned using TPM, the registration ID is kept equal to the IoT Hub device ID. The error logged by the module is:
I have already tried various steps without success:
But none of the solutions (in any combination) worked, and the error is still the same.
Some more info:
I looked for documentation, examples, or any similar issue, and I didn't find any suggestion or indication about how to connect an IoT Edge module to IoT Hub using TPM authentication.
Sandro