Open red-hood opened 6 years ago
I can reproduce the issue with:
system: "x86_64-linux"
host os: Linux 4.14.83, NixOS, 18.09.1436.a7fd4310c0c (Jellyfish)
multi-user?: yes
sandbox: yes
version: nix-env (Nix) 2.1.3
channels(root): "nixos-18.09.1436.a7fd4310c0c"
nixpkgs: /nix/var/nix/profiles/per-user/root/channels/nixos
lscpu:
lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 12
On-line CPU(s) list: 0-11
Thread(s) per core: 1
Core(s) per socket: 12
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 95
Model name: Intel(R) Atom(TM) CPU C3858 @ 2.00GHz
Stepping: 1
CPU MHz: 1999.981
CPU max MHz: 2000.0000
CPU min MHz: 800.0000
BogoMIPS: 4000.00
Virtualization: VT-x
L1d cache: 24K
L1i cache: 32K
L2 cache: 2048K
NUMA node0 CPU(s): 0-11
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg cx16 xtpr pdcm sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave rdrand lahf_lm 3dnowprefetch cpuid_fault epb cat_l2 ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust smep erms mpx rdt_a rdseed smap clflushopt intel_pt sha_ni xsaveopt xsavec xgetbv1 xsaves dtherm arat pln pts arch_capabilities
The same issue appears whether the backing device of the encrypted volume is a real SSD, HDD, NVMe disk, or just a file in tmpfs:
root@nixos> dd if=/dev/zero of=/tmp/test.disk bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 1.3424 s, 781 MB/s
root@nixos> cryptsetup luksFormat /tmp/test.disk
WARNING!
========
This will overwrite data on /tmp/test.disk irrevocably.
Are you sure? (Type uppercase yes): YES
Enter passphrase for /tmp/test.disk:
Verify passphrase:
root@nixos> cryptsetup open /tmp/test.disk crypttest
Enter passphrase for /tmp/test.disk:
root@nixos> dd if=/dev/zero of=/dev/mapper/crypttest bs=1M count=500
[HANG]
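While the dd is hung, the stalled writer would typically show up in uninterruptible sleep (state D) in ps. A minimal sketch for spotting it from a second terminal; the helper name d_state_procs is made up for illustration, and it assumes a procps-style ps that accepts -eo pid,stat,comm:

```shell
# Hypothetical helper: reads `ps -eo pid,stat,comm` output on stdin and
# prints "PID COMMAND" for every process whose state starts with D
# (uninterruptible sleep). The first line is assumed to be the ps header.
d_state_procs() {
    awk 'NR > 1 && $2 ~ /^D/ { print $1, $3 }'
}

# Usage on a live system:
#   ps -eo pid,stat,comm | d_state_procs
```

A dd entry that stays in this list across repeated runs is consistent with the hang shown above, although short-lived D states also occur during normal I/O.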
The same happens on an all-tmpfs NixOS 18.09 LiveCD.
I can also confirm that the very same procedure does not hang on a current Ubuntu 18.04 LiveCD (ubuntu-18.04.1-live-server-amd64.iso) with a 4.15.0-29-generic kernel.
The kernel config that was found in /boot can be seen at https://gist.github.com/andir/c9f3fd32675179ef60dd9e0ba9512a66.
A Fedora bug report about this has been open for about a year: https://bugzilla.redhat.com/show_bug.cgi?id=1522962
No solution was found there, but its description matches this issue closely.
When the QAT modules are blacklisted, as hinted in the linked Bugzilla report, the kernel no longer stalls when encrypting data.
boot.blacklistedKernelModules = [ "intel_qat" "qat_c3xxx" ];
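To check after a reboot whether the blacklist took effect, one can look for the two modules in the lsmod output. A minimal sketch; the helper name qat_modules_loaded is made up for illustration, and it only matches the two module names from the blacklist line above:

```shell
# Hypothetical helper: reads `lsmod`-style output on stdin and prints the
# names of the QAT modules named in this issue that are still loaded.
qat_modules_loaded() {
    awk '$1 == "intel_qat" || $1 == "qat_c3xxx" { print $1 }'
}

# Usage on a live system:
#   lsmod | qat_modules_loaded
# No output means neither module is currently loaded.
```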
According to https://01.org/sites/default/files/downloads/intelr-quickassist-technology/336211-008qatrelnotes.pdf section 3.1.2:
So using this module for any kind of dm-crypt related work is probably a bad idea. We should blacklist or remove it by default…
I'd like to thank you for reporting and documenting this issue, although I don't have any connection to NixOS.
After acquiring a new main board with an Intel Atom CPU like yours, writes stalled for me as well. It took me two days to identify the board as the root cause and five minutes to find this bug report. Now, with the QAT modules blacklisted, dm-crypt fully works again.
Thanks again!
(If you don't like this offtopic comment on this issue, feel free to remove it. I just wanted to say thanks)
Quick update:
Since I don't really need this QAT feature, I simply disabled it in the BIOS at some point and haven't spent any more time on this issue.
Issue description
When using either the 4.14 or 4.15 kernel from 18.03 with LUKS on a SATA SSD, writes to the crypt block device stall and the process issuing the write stays in the syscall forever. The kernel log shows the following entries:
For the 4.14 kernel:
Steps to reproduce
The problem originally occurred when creating a BTRFS file system on the device; I used dd only to rule out any file system influence.
Technical details
Please run
nix-shell -p nix-info --run "nix-info -m"
and paste the results.
I am currently running Ubuntu 18.04 on the machine (due to its native ZFS support), which works fine. I will run the info command once I am testing new NixOS kernels again.
Some hardware information
Kernel versions were:
The Ubuntu kernel, which is running fine with LUKS on this machine is: