nokyan / resources

Keep an eye on system resources
GNU General Public License v3.0

Support for NPU devices #302

Closed adrianboguszewski closed 1 month ago

adrianboguszewski commented 4 months ago

Is there an existing issue for this?

Is your feature request related to a problem? Please describe.

No response

Describe the solution you'd like

Many modern CPUs e.g. Intel Core Ultra 7 155H include integrated NPUs (Neural Processing Units). It would be nice to display the utilization of these devices in Resources like in Windows Task Manager.

Describe alternatives you've considered

No response

Additional context

image

nokyan commented 4 months ago

Hi, thanks for the issue. I believe you forgot to change the name when copy-pasting; this repo isn't quite Mission Center. ^^ If Intel exposes those statistics, it should (hopefully) be fairly straightforward to implement. I'll see what I can do and what the kernel offers. :)

adrianboguszewski commented 4 months ago

@nokyan Oh, you're right, changed :)

@jwludzik do we expose that kind of information (e.g. utilization) in the driver?

adrianboguszewski commented 4 months ago

Linux NPU driver repo: https://github.com/intel/linux-npu-driver

m-falkowski commented 3 months ago

Hello @adrianboguszewski @nokyan,

I am going to prepare an example of how to measure NPU utilization by next week.

m-falkowski commented 3 months ago

Hi @adrianboguszewski @nokyan,

NPU utilization may be calculated using the device's sysfs file npu_busy_time_us, which contains the time the device has spent executing jobs. The NPU is considered 'busy' from the moment the first job is submitted to the firmware until there are no more jobs pending or executing.

To measure utilization, either compute the difference (delta) of npu_busy_time_us across a workload to see how long the NPU was active, or monitor the utilization percentage by reading npu_busy_time_us periodically.

You may see the commit introducing this feature: accel/ivpu: Share NPU busy time in sysfs

This is sample Bash code that showcases how to compute the utilization:

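# Device-specific path: adjust the PCI address to match your NPU.
NPU_BUSY_TIME_PATH="/sys/devices/pci0000:00/0000:00:0b.0/npu_busy_time_us"
# Sampling period in whole seconds (assumed default of 1).
SAMPLING_PERIOD=1
TIME_1=$(cat "$NPU_BUSY_TIME_PATH")
while true; do
  sleep "$SAMPLING_PERIOD"
  TIME_2=$(cat "$NPU_BUSY_TIME_PATH")
  clear
  DELTA=$((TIME_2 - TIME_1))
  echo "NPU busy time: $TIME_2 us"
  echo "NPU busy time delta: $DELTA us"
  # Busy time is in microseconds, so divide by the period in microseconds.
  echo "NPU Utilization: $(( 100 * DELTA / (SAMPLING_PERIOD * 1000000) ))%"
  TIME_1=$TIME_2
done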
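
The Bash loop above covers periodic monitoring. The other approach, measuring how long the NPU was busy across a single workload, could be sketched in Rust roughly as follows. The names `measure`, `WorkloadMeasurement`, and `read_busy_us` are hypothetical and only for illustration; in practice the reader closure would parse the contents of npu_busy_time_us.

```rust
use std::time::{Duration, Instant};

/// Busy time and wall-clock time observed across one workload run.
struct WorkloadMeasurement {
    busy: Duration,
    elapsed: Duration,
}

impl WorkloadMeasurement {
    /// Fraction of the wall-clock time during which the NPU was busy.
    fn utilization(&self) -> f64 {
        if self.elapsed.is_zero() {
            return 0.0;
        }
        self.busy.as_secs_f64() / self.elapsed.as_secs_f64()
    }
}

/// Samples the busy-time counter (in microseconds) before and after `workload`.
/// `read_busy_us` is a hypothetical reader supplied by the caller.
fn measure<R, W>(mut read_busy_us: R, workload: W) -> WorkloadMeasurement
where
    R: FnMut() -> u64,
    W: FnOnce(),
{
    let start = Instant::now();
    let before = read_busy_us();
    workload();
    let after = read_busy_us();
    WorkloadMeasurement {
        // saturating_sub guards against a counter that went backwards.
        busy: Duration::from_micros(after.saturating_sub(before)),
        elapsed: start.elapsed(),
    }
}

fn main() {
    // Fake counter values standing in for two reads of the sysfs file.
    let mut samples = vec![1_000u64, 251_000].into_iter();
    let m = measure(|| samples.next().unwrap(), || { /* run the workload here */ });
    println!("NPU busy for {:?} (utilization {:.2})", m.busy, m.utilization());
}
```

Passing the reader as a closure keeps the sketch independent of the device path, which differs between machines.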

nokyan commented 3 months ago

@m-falkowski Thanks a lot! I'm afraid NPU support won't make it into Resources 1.6, though, which will be released in about two weeks. I'll start implementing it after the release. :)

nokyan commented 1 month ago

Hi there, I've finally gotten around to implementing NPU support. Do you mind checking out the npu branch and seeing if it works?

m-falkowski commented 1 month ago

Hi @nokyan,

Thank you for this change! I tested it and it works reliably, but it calculates the utilization percentage incorrectly. The utilization was not scaled, so it was shown in thousands of percent.

https://github.com/nokyan/resources/blob/npu/src/utils/npu/intel.rs#L67-L82

    fn usage(&self) -> Result<f64> {
        let last_timestamp = self.last_busy_time_timestamp.get();
        let last_busy_time = self.last_busy_time_us.get();

        let new_timestamp = unix_as_millis();
        let new_busy_time = self
            .read_device_int("npu_busy_time_us")
            .map(|int| int as usize)?;

        self.last_busy_time_timestamp.set(new_timestamp);
        self.last_busy_time_us.set(new_busy_time);

        let delta_timestamp = new_timestamp.saturating_sub(last_timestamp) as f64;
        let delta_busy_time = new_busy_time.saturating_sub(last_busy_time) as f64;

        Ok(delta_busy_time / delta_timestamp)
    }

Changing the last division into Ok((delta_busy_time / delta_timestamp) / 1000.0) resolves the issue, and the utilization is then displayed correctly, as shown below:

utilization
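
The factor of 1000 comes from a unit mismatch: the busy time is counted in microseconds, while the timestamps are in milliseconds (assuming unix_as_millis() returns milliseconds, as its name suggests). A minimal standalone sketch of the arithmetic, using a hypothetical helper `busy_fraction` that is not part of Resources:

```rust
/// Fraction of the sampling window during which the NPU was busy.
/// The busy delta is in microseconds and the window is in milliseconds.
fn busy_fraction(delta_busy_us: u64, delta_time_ms: u64) -> f64 {
    if delta_time_ms == 0 {
        return 0.0;
    }
    // 1 ms = 1000 us, so convert the window to microseconds before dividing.
    delta_busy_us as f64 / (delta_time_ms as f64 * 1000.0)
}

fn main() {
    // Example: the NPU was busy for 500_000 us during a 1_000 ms window,
    // i.e. half of the time.
    let unscaled = 500_000.0_f64 / 1_000.0; // us / ms: 1000 times too large
    let scaled = busy_fraction(500_000, 1_000);
    println!("unscaled: {}, scaled: {}", unscaled, scaled);
}
```

Dividing by (delta_timestamp * 1000.0) is mathematically the same as the suggested (delta_busy_time / delta_timestamp) / 1000.0.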

nokyan commented 1 month ago

Looks good! I'll fix that soon. Unfortunately the driver doesn't seem to expose anything but the compute utilization, no memory usage or frequencies yet. :/

adrianboguszewski commented 4 weeks ago

@nokyan I'm happy to see that :) When can we expect the next release with this feature on board?

nokyan commented 4 weeks ago

I plan to release 1.7 with NPU support probably on 29 November :)

Martin-HZK commented 3 weeks ago

How can I get the NPU listed in the performance layout? My Ubuntu 24.04 does not even identify the device.

nokyan commented 3 weeks ago

I can't help you without any information. What NPU do you have? Are you trying out the latest commit on GitHub or the current release on Flathub? The current release on Flathub doesn't have this feature yet, it will be included with the next release. What kernel do you currently use? Please send me the output of uname -a in your terminal.

Martin-HZK commented 3 weeks ago

Thank you for your timely reply! The detailed system information is as follows:

Currently I am using an Intel Corporation Meteor Lake NPU, and the driver I installed is Linux NPU Driver v1.8.0, as released on GitHub. The OS I installed is Ubuntu 24.04 LTS.

$ uname -a
Linux hzk-Martin 6.8.1+ #2 SMP PREEMPT_DYNAMIC Thu Oct 24 18:31:43 CST 2024 x86_64 x86_64 x86_64 GNU/Linux

nokyan commented 3 weeks ago

Are you using the current production release of Resources on Flathub or are you building and using the latest commit from GitHub?

Martin-HZK commented 3 weeks ago

I have strictly adhered to the installation guidelines on GitHub, using the corresponding release version.

nokyan commented 3 weeks ago

Could you run Resources from your terminal with the environment variable RUST_LOG=resources=debug set and send me the output?

Martin-HZK commented 3 weeks ago

Sry for my mistake. Currently the exact 'Resources' application is downloaded from Flatpak.

nokyan commented 3 weeks ago

Thanks, no worries. The current release of Resources on Flathub does not have the NPU feature yet.

Martin-HZK commented 3 weeks ago

So do any of the releases on GitHub support the NPU? I didn't see any release that mentions NPU. Or which commit hash would you recommend for NPU profiling?

nokyan commented 3 weeks ago

The NPU support is already in the main branch, just not in a production release yet. The branch for NPU support has been merged in commit 5583d7d64d6b0f8d4ee0a5b639505a887341b462. In the README you can find instructions on how to build the latest commit of Resources yourself. It boils down to cloning the repo and running this in your terminal while being in the repo's root:

flatpak install org.gnome.Sdk//47 org.freedesktop.Sdk.Extension.rust-stable//24.08 org.gnome.Platform//47 org.freedesktop.Sdk.Extension.llvm18//24.08
flatpak-builder --user flatpak_app build-aux/net.nokyan.Resources.Devel.json
flatpak-builder --run flatpak_app build-aux/net.nokyan.Resources.Devel.json resources

Release 1.7, which will ship NPU support, will release most likely on 29 November to Flathub and of course will also get a GitHub tag and release.

Martin-HZK commented 3 weeks ago

The newest commit returns a result like this:

image

Why did this happen? It seems that the NPU info is not successfully collected with commit 215dd36.

nokyan commented 3 weeks ago

You are running Linux 6.8, which I believe does not contain support for the sysfs interface that allows Resources to track usage for Intel NPUs. You could try updating your kernel. It could also be that Ubuntu 24.04 just does not offer a kernel that's new enough for this feature. In that case, you could either try upgrading to Ubuntu 24.10 or try to manually install a newer kernel, though the latter can be risky if you're not experienced with that.

Martin-HZK commented 3 weeks ago

Does that mean I cannot even access the NPU services with rebuilding the current linux kernel?

jwludzik commented 3 weeks ago

Yes. Ubuntu releases are tied to a fixed kernel version; see "Kernel release schedule" at https://ubuntu.com/kernel/lifecycle. The NPU utilization solution used in nokyan/resources (many thanks to @nokyan) relies on patch 0adff3b0ef12483a79dc8415b94547853d26d1f3, which was merged into Linux kernel v6.11. I am sorry for the inconvenience. The main recommendation is to use Ubuntu 24.10, as @nokyan mentioned previously.
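
For illustration, a check of a kernel release string against the v6.11 requirement could be sketched in Rust as below. The helpers `parse_major_minor` and `has_npu_busy_time` are hypothetical and not part of Resources or the driver. A version comparison is only a heuristic, since a distribution kernel may carry backported patches; checking whether the npu_busy_time_us file exists is likely the more robust test.

```rust
/// Parses the leading "major.minor" of a kernel release string such as "6.8.1+".
fn parse_major_minor(release: &str) -> Option<(u32, u32)> {
    let mut parts = release.split('.');
    let major: u32 = parts.next()?.parse().ok()?;
    // Keep only the leading digits of the minor part, e.g. "11-rc1" -> 11.
    let minor_digits: String = parts
        .next()?
        .chars()
        .take_while(|c| c.is_ascii_digit())
        .collect();
    let minor: u32 = minor_digits.parse().ok()?;
    Some((major, minor))
}

/// Returns true if the release is at least 6.11, the version named in this
/// thread as containing the npu_busy_time_us patch.
fn has_npu_busy_time(release: &str) -> bool {
    match parse_major_minor(release) {
        // Tuples compare lexicographically: major first, then minor.
        Some(version) => version >= (6, 11),
        None => false,
    }
}

fn main() {
    // Release string taken from the `uname -a` output posted earlier.
    let release = "6.8.1+";
    println!("{} new enough: {}", release, has_npu_busy_time(release));
}
```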

Martin-HZK commented 3 weeks ago

Thank you for your advice! The problem was solved by upgrading the kernel to version 6.11.

adrianboguszewski commented 3 weeks ago

Wow! It looks super cool! Screenshot From 2024-10-30 14-43-51 Thanks, @nokyan for implementing that!

nokyan commented 3 weeks ago

I'm glad you like it!