Closed adamantinum closed 1 year ago
I've never tried this myself, so I'm not really sure if it should work. You get sent to a shell, you say? Is the root filesystem not mounting at all? What happens if you try to mount it manually from the emergency shell?
Yes, I thought to do the same. The strange thing is that the kernel doesn't seem to detect the device at all (at least nothing is shown in /dev) so I cannot mount the root by hand, although lsmod shows the apfs module as loaded. If I add to MODULES the vfat module too, neither the boot partition is detected... so I don't know what to think, maybe the unknown APFS filesystem messes something up...
I don't think it's an installation issue. since following the same process but using a "common" filesystem such as btrfs everything works.
as reference
Weird, did you check dmesg?
I think it was an issue with kernel parameters, using systemd-boot there is still the issue, booting directly a unified kernel image the kernel is able to mount the root, start some services and then system enters in Emergency Mode. I suppose it's an improvement.
Ok, just a bit more of context, it signals a NULL point dereference: https://pastebin.com/J1q1sg46
No idea what's going on, but it's not very surprising that there are problems with my mmap. If you want to help me debug this, I guess I can send you patches with printf statements to get more info. Otherwise I'll try to test this myself, but it may take a few days. By the way, what's your kernel version?
I would be happy to help! The kernel version is 6.4.8
That's great, thanks. This patch should at least tell us where the npd is happening (assuming it doesn't break anything new):
diff --git a/extents.c b/extents.c
index bc91d28..357c351 100644
--- a/extents.c
+++ b/extents.c
@@ -42,10 +42,17 @@ static inline u64 apfs_size_to_blocks(struct super_block *sb, u64 size)
int apfs_extent_from_query(struct apfs_query *query,
struct apfs_file_extent *extent)
{
- struct super_block *sb = query->node->object.sb;
- char *raw = query->node->object.data;
+ struct super_block *sb = NULL;
+ char *raw = NULL;
u64 ext_len;
+ BUG_ON(!query);
+ BUG_ON(IS_ERR(query));
+ BUG_ON(!query->node);
+ BUG_ON(IS_ERR(query->node));
+ sb = query->node->object.sb;
+ raw = query->node->object.data;
+
if(!apfs_is_sealed(sb)) {
struct apfs_file_extent_val *ext = NULL;
struct apfs_file_extent_key *ext_key = NULL;
@@ -292,7 +299,7 @@ static int apfs_range_put_reference(struct super_block *sb, u64 paddr, u64 lengt
*/
static int apfs_shrink_extent_head(struct apfs_query *query, struct apfs_dstream_info *dstream, u64 start)
{
- struct super_block *sb = query->node->object.sb;
+ struct super_block *sb = NULL;
struct apfs_file_extent_key key;
struct apfs_file_extent_val val;
struct apfs_file_extent extent;
@@ -300,6 +307,14 @@ static int apfs_shrink_extent_head(struct apfs_query *query, struct apfs_dstream
void *raw = NULL;
int err = 0;
+ BUG_ON(!dstream);
+ BUG_ON(IS_ERR(dstream));
+ BUG_ON(!query);
+ BUG_ON(IS_ERR(query));
+ BUG_ON(!query->node);
+ BUG_ON(IS_ERR(query->node));
+ sb = query->node->object.sb;
+
err = apfs_extent_from_query(query, &extent);
if (err) {
apfs_err(sb, "bad extent record for dstream 0x%llx", dstream->ds_id);
@@ -342,13 +357,21 @@ static int apfs_shrink_extent_head(struct apfs_query *query, struct apfs_dstream
*/
static int apfs_shrink_extent_tail(struct apfs_query *query, struct apfs_dstream_info *dstream, u64 end)
{
- struct super_block *sb = query->node->object.sb;
+ struct super_block *sb = NULL;
struct apfs_file_extent_val *val;
struct apfs_file_extent extent;
u64 new_len, new_blkcount, tail_len;
void *raw;
int err = 0;
+ BUG_ON(!dstream);
+ BUG_ON(IS_ERR(dstream));
+ BUG_ON(!query);
+ BUG_ON(IS_ERR(query));
+ BUG_ON(!query->node);
+ BUG_ON(IS_ERR(query->node));
+ sb = query->node->object.sb;
+
ASSERT((end & (sb->s_blocksize - 1)) == 0);
err = apfs_query_join_transaction(query);
@@ -391,9 +414,15 @@ static int apfs_shrink_extent_tail(struct apfs_query *query, struct apfs_dstream
*/
static inline bool apfs_query_found_extent(struct apfs_query *query)
{
- void *raw = query->node->object.data;
+ void *raw = NULL;
struct apfs_key_header *hdr;
+ BUG_ON(!query);
+ BUG_ON(IS_ERR(query));
+ BUG_ON(!query->node);
+ BUG_ON(IS_ERR(query->node));
+ raw = query->node->object.data;
+
if (query->key_len < sizeof(*hdr))
return false;
hdr = raw + query->key_off;
@@ -420,6 +449,9 @@ static int apfs_update_tail_extent(struct apfs_dstream_info *dstream, const stru
int ret;
u64 new_crypto;
+ BUG_ON(!extent);
+ BUG_ON(IS_ERR(extent));
+
apfs_key_set_hdr(APFS_TYPE_FILE_EXTENT, extent_id, &raw_key);
raw_key.logical_addr = cpu_to_le64(extent->logical_addr);
raw_val.len_and_flags = cpu_to_le64(extent->len);
@@ -433,6 +465,8 @@ static int apfs_update_tail_extent(struct apfs_dstream_info *dstream, const stru
/* We want the last extent record */
apfs_init_file_extent_key(extent_id, -1, &key);
+ BUG_ON(!sbi->s_cat_root);
+ BUG_ON(IS_ERR(sbi->s_cat_root));
query = apfs_alloc_query(sbi->s_cat_root, NULL /* parent */);
if (!query)
return -ENOMEM;
@@ -528,7 +562,7 @@ out:
*/
static int apfs_split_extent(struct apfs_query *query, u64 div)
{
- struct super_block *sb = query->node->object.sb;
+ struct super_block *sb = NULL;
struct apfs_file_extent_val *val1;
struct apfs_file_extent_key key2;
struct apfs_file_extent_val val2;
@@ -537,6 +571,12 @@ static int apfs_split_extent(struct apfs_query *query, u64 div)
void *raw;
int err = 0;
+ BUG_ON(!query);
+ BUG_ON(IS_ERR(query));
+ BUG_ON(!query->node);
+ BUG_ON(IS_ERR(query->node));
+ sb = query->node->object.sb;
+
err = apfs_query_join_transaction(query);
if (err) {
apfs_err(sb, "query join failed");
@@ -597,6 +637,9 @@ static int apfs_update_mid_extent(struct apfs_dstream_info *dstream, const struc
bool second_run = false;
int ret;
+ BUG_ON(!extent);
+ BUG_ON(IS_ERR(extent));
+
apfs_key_set_hdr(APFS_TYPE_FILE_EXTENT, extent_id, &raw_key);
raw_key.logical_addr = cpu_to_le64(extent->logical_addr);
raw_val.len_and_flags = cpu_to_le64(extent->len);
@@ -610,6 +653,8 @@ static int apfs_update_mid_extent(struct apfs_dstream_info *dstream, const struc
apfs_init_file_extent_key(extent_id, extent->logical_addr, &key);
search_and_insert:
+ BUG_ON(!sbi->s_cat_root);
+ BUG_ON(IS_ERR(sbi->s_cat_root));
query = apfs_alloc_query(sbi->s_cat_root, NULL /* parent */);
if (!query)
return -ENOMEM;
@@ -765,6 +810,9 @@ static int apfs_extend_phys_extent(struct apfs_query *query, u64 bno, u64 blkcnt
struct apfs_phys_ext_val raw_val;
u64 kind = (u64)APFS_KIND_NEW << APFS_PEXT_KIND_SHIFT;
+ BUG_ON(!query);
+ BUG_ON(IS_ERR(query));
+
apfs_key_set_hdr(APFS_TYPE_EXTENT, bno, &raw_key);
raw_val.len_and_kind = cpu_to_le64(kind | blkcnt);
raw_val.owning_obj_id = cpu_to_le64(dstream_id);
@@ -778,6 +826,9 @@ static int apfs_insert_new_phys_extent(struct apfs_query *query, u64 bno, u64 bl
struct apfs_phys_ext_val raw_val;
u64 kind = (u64)APFS_KIND_NEW << APFS_PEXT_KIND_SHIFT;
+ BUG_ON(!query);
+ BUG_ON(IS_ERR(query));
+
apfs_key_set_hdr(APFS_TYPE_EXTENT, bno, &raw_key);
raw_val.len_and_kind = cpu_to_le64(kind | blkcnt);
raw_val.owning_obj_id = cpu_to_le64(dstream_id);
@@ -807,6 +858,9 @@ static int apfs_insert_phys_extent(struct apfs_dstream_info *dstream, const stru
u64 last_bno, new_base, new_blkcnt;
int ret;
+ BUG_ON(!extent);
+ BUG_ON(IS_ERR(extent));
+
extref_root = apfs_read_node(sb,
le64_to_cpu(vsb_raw->apfs_extentref_tree_oid),
APFS_OBJ_PHYSICAL, true /* write */);
@@ -814,6 +868,8 @@ static int apfs_insert_phys_extent(struct apfs_dstream_info *dstream, const stru
apfs_err(sb, "failed to read extref root 0x%llx", le64_to_cpu(vsb_raw->apfs_extentref_tree_oid));
return PTR_ERR(extref_root);
}
+ BUG_ON(!extref_root);
+ BUG_ON(IS_ERR(extref_root));
apfs_assert_in_transaction(sb, &vsb_raw->apfs_o);
vsb_raw->apfs_extentref_tree_oid = cpu_to_le64(extref_root->object.oid);
@@ -904,10 +960,17 @@ out:
*/
static int apfs_phys_ext_from_query(struct apfs_query *query, struct apfs_phys_extent *pext)
{
- struct super_block *sb = query->node->object.sb;
+ struct super_block *sb = NULL;
struct apfs_phys_ext_key *key;
struct apfs_phys_ext_val *val;
- char *raw = query->node->object.data;
+ char *raw = NULL;
+
+ BUG_ON(!query);
+ BUG_ON(IS_ERR(query));
+ BUG_ON(!query->node);
+ BUG_ON(IS_ERR(query->node));
+ sb = query->node->object.sb;
+ raw = query->node->object.data;
if (query->len != sizeof(*val) || query->key_len != sizeof(*key)) {
apfs_err(sb, "bad length of key (%d) or value (%d)", query->key_len, query->len);
@@ -952,11 +1015,19 @@ static int apfs_free_phys_ext(struct super_block *sb, struct apfs_phys_extent *p
*/
static int apfs_put_phys_extent(struct apfs_phys_extent *pext, struct apfs_query *query)
{
- struct super_block *sb = query->node->object.sb;
+ struct super_block *sb = NULL;
struct apfs_phys_ext_val *val;
void *raw;
int err;
+ BUG_ON(!pext);
+ BUG_ON(IS_ERR(pext));
+ BUG_ON(!query);
+ BUG_ON(IS_ERR(query));
+ BUG_ON(!query->node);
+ BUG_ON(IS_ERR(query->node));
+ sb = query->node->object.sb;
+
if (--pext->refcnt == 0) {
err = apfs_btree_remove(query);
if (err) {
@@ -1030,7 +1101,7 @@ static inline void apfs_set_phys_ext_length(struct apfs_phys_ext_val *pext, u64
*/
static int apfs_split_phys_ext(struct apfs_query *query, u64 div)
{
- struct super_block *sb = query->node->object.sb;
+ struct super_block *sb = NULL;
struct apfs_phys_ext_val *val1;
struct apfs_phys_ext_key key2;
struct apfs_phys_ext_val val2;
@@ -1039,6 +1110,12 @@ static int apfs_split_phys_ext(struct apfs_query *query, u64 div)
void *raw;
int err = 0;
+ BUG_ON(!query);
+ BUG_ON(IS_ERR(query));
+ BUG_ON(!query->node);
+ BUG_ON(IS_ERR(query->node));
+ sb = query->node->object.sb;
+
err = apfs_query_join_transaction(query);
if (err) {
apfs_err(sb, "query join failed");
@@ -1173,6 +1250,8 @@ static int apfs_create_hole(struct apfs_dstream_info *dstream, u64 start, u64 en
raw_val.crypto_id = cpu_to_le64(apfs_vol_is_encrypted(sb) ? extent_id : 0);
apfs_init_file_extent_key(extent_id, start, &key);
+ BUG_ON(!sbi->s_cat_root);
+ BUG_ON(IS_ERR(sbi->s_cat_root));
query = apfs_alloc_query(sbi->s_cat_root, NULL /* parent */);
if (!query)
return -ENOMEM;
@@ -1227,10 +1306,13 @@ static int apfs_zero_dstream_tail(struct apfs_dstream_info *dstream)
}
/* This will take care of the CoW and zeroing */
+ apfs_alert(sb, "Before __apfs_write_begin()...");
err = __apfs_write_begin(NULL, inode->i_mapping, inode->i_size, 0, 0, &page, &fsdata);
if (err)
return err;
- return __apfs_write_end(NULL, inode->i_mapping, inode->i_size, 0, 0, page, fsdata);
+ err = __apfs_write_end(NULL, inode->i_mapping, inode->i_size, 0, 0, page, fsdata);
+ apfs_alert(sb, "After __apfs_write_end()...");
+ return err;
}
/**
@@ -1242,6 +1324,12 @@ static int apfs_zero_dstream_tail(struct apfs_dstream_info *dstream)
static void apfs_zero_bh_tail(struct super_block *sb, struct buffer_head *bh, u64 length)
{
ASSERT(buffer_trans(bh));
+
+ BUG_ON(!bh->b_data);
+ BUG_ON(IS_ERR(bh->b_data));
+ BUG_ON(!(bh->b_data + length));
+ BUG_ON(IS_ERR(bh->b_data + length));
+
if (length < sb->s_blocksize)
memset(bh->b_data + length, 0, sb->s_blocksize - length);
}
@@ -1270,6 +1358,8 @@ static int apfs_range_in_snap(struct super_block *sb, u64 bno, u64 blkcnt, bool
return 0;
}
+ BUG_ON(1);
+
/*
* Now check if the current physical extent tree has an entry for
* these blocks
@@ -1360,13 +1450,22 @@ static int apfs_dstream_cache_in_snap(struct apfs_dstream_info *dstream, bool *i
*/
static int apfs_dstream_get_new_block(struct apfs_dstream_info *dstream, u64 dsblock, struct buffer_head *bh_result, u64 *bno)
{
- struct super_block *sb = dstream->ds_sb;
- struct apfs_superblock *vsb_raw = APFS_SB(sb)->s_vsb_raw;
+ struct super_block *sb = NULL;
+ struct apfs_superblock *vsb_raw = NULL;
struct apfs_file_extent *cache = NULL;
u64 phys_bno, logical_addr, cache_blks, dstream_blks;
bool in_snap = true;
int err;
+ BUG_ON(!dstream);
+ BUG_ON(IS_ERR(dstream));
+ sb = dstream->ds_sb;
+ BUG_ON(!sb);
+ BUG_ON(IS_ERR(sb));
+ vsb_raw = APFS_SB(sb)->s_vsb_raw;
+ BUG_ON(!vsb_raw);
+ BUG_ON(IS_ERR(vsb_raw));
+
/* TODO: preallocate tail blocks */
logical_addr = dsblock << sb->s_blocksize_bits;
@@ -1477,7 +1576,11 @@ int apfs_dstream_get_new_bno(struct apfs_dstream_info *dstream, u64 dsblock, u64
int apfs_get_new_block(struct inode *inode, sector_t iblock,
struct buffer_head *bh_result, int create)
{
- struct apfs_inode_info *ai = APFS_I(inode);
+ struct apfs_inode_info *ai = NULL;
+
+ BUG_ON(!inode);
+ BUG_ON(IS_ERR(inode));
+ ai = APFS_I(inode);
ASSERT(create);
return apfs_dstream_get_new_block(&ai->i_dstream, iblock, bh_result, NULL /* bno */);
@@ -1814,17 +1917,24 @@ static int apfs_put_single_extent(struct super_block *sb, u64 *paddr_end, u64 pa
struct apfs_file_extent tmp = {0}; /* TODO: clean up all the fake extent interfaces? */
int ret;
+ BUG_ON(!paddr_end);
+ BUG_ON(IS_ERR(paddr_end));
+
extref_root = apfs_read_node(sb, le64_to_cpu(vsb_raw->apfs_extentref_tree_oid), APFS_OBJ_PHYSICAL, true /* write */);
if (IS_ERR(extref_root)) {
apfs_err(sb, "failed to read extref root 0x%llx", le64_to_cpu(vsb_raw->apfs_extentref_tree_oid));
return PTR_ERR(extref_root);
}
+ BUG_ON(!extref_root);
+ BUG_ON(IS_ERR(extref_root));
apfs_assert_in_transaction(sb, &vsb_raw->apfs_o);
vsb_raw->apfs_extentref_tree_oid = cpu_to_le64(extref_root->object.oid);
apfs_init_extent_key(*paddr_end - 1, &key);
restart:
+ BUG_ON(!extref_root);
+ BUG_ON(IS_ERR(extref_root));
query = apfs_alloc_query(extref_root, NULL /* parent */);
if (!query) {
ret = -ENOMEM;
diff --git a/inode.c b/inode.c
index a12d4b3..d58d840 100644
--- a/inode.c
+++ b/inode.c
@@ -350,6 +350,8 @@ int apfs_crypto_adj_refcnt(struct super_block *sb, u64 crypto_id, int delta)
if (!crypto_id)
return 0;
+ BUG_ON(!sbi->s_cat_root);
+ BUG_ON(IS_ERR(sbi->s_cat_root));
apfs_init_crypto_state_key(crypto_id, &key);
query = apfs_alloc_query(sbi->s_cat_root, NULL /* parent */);
if (!query)
Using the patch I get the same result... no other logs during boot. I can reach an emergency shell only booting with systemd.unit=emergency.target, because it is launched before the crash happens. This is the dmesg in the latter case, but since it's before the crash I don't know if it could be useful.
Using systemd.unit=rescue.target the crash happens, so it should be caused by something else than the mounting process itself.
No idea... This is going to flood your console but it should at least narrow it down a little. You can apply it on top of the other patch.
diff --git a/extents.c b/extents.c
index 357c351..a084b13 100644
--- a/extents.c
+++ b/extents.c
@@ -1457,6 +1457,8 @@ static int apfs_dstream_get_new_block(struct apfs_dstream_info *dstream, u64 dsb
bool in_snap = true;
int err;
+ apfs_alert(sb, "1");
+
BUG_ON(!dstream);
BUG_ON(IS_ERR(dstream));
sb = dstream->ds_sb;
@@ -1466,6 +1468,8 @@ static int apfs_dstream_get_new_block(struct apfs_dstream_info *dstream, u64 dsb
BUG_ON(!vsb_raw);
BUG_ON(IS_ERR(vsb_raw));
+ apfs_alert(sb, "2");
+
/* TODO: preallocate tail blocks */
logical_addr = dsblock << sb->s_blocksize_bits;
@@ -1474,17 +1478,20 @@ static int apfs_dstream_get_new_block(struct apfs_dstream_info *dstream, u64 dsb
apfs_err(sb, "block allocation failed");
return err;
}
+ apfs_alert(sb, "3");
apfs_assert_in_transaction(sb, &vsb_raw->apfs_o);
le64_add_cpu(&vsb_raw->apfs_fs_alloc_count, 1);
le64_add_cpu(&vsb_raw->apfs_total_blocks_alloced, 1);
if (bno)
*bno = phys_bno;
+ apfs_alert(sb, "4");
if (bh_result) {
apfs_map_bh(bh_result, sb, phys_bno);
err = apfs_transaction_join(sb, bh_result);
if (err)
return err;
+ apfs_alert(sb, "5");
if (!buffer_uptodate(bh_result)) {
/*
@@ -1499,7 +1506,9 @@ static int apfs_dstream_get_new_block(struct apfs_dstream_info *dstream, u64 dsb
*/
apfs_zero_bh_tail(sb, bh_result, dstream->ds_size - logical_addr);
}
+ apfs_alert(sb, "6");
}
+ apfs_alert(sb, "7");
dstream_blks = apfs_size_to_blocks(sb, dstream->ds_size);
if (dstream_blks < dsblock) {
@@ -1507,16 +1516,20 @@ static int apfs_dstream_get_new_block(struct apfs_dstream_info *dstream, u64 dsb
* This recurses into apfs_dstream_get_new_block() and dirties
* the extent cache, so it must happen before flushing it.
*/
+ apfs_alert(sb, "8");
err = apfs_zero_dstream_tail(dstream);
if (err) {
apfs_err(sb, "failed to zero tail for dstream 0x%llx", dstream->ds_id);
return err;
}
+ apfs_alert(sb, "9");
}
+ apfs_alert(sb, "10");
err = apfs_dstream_cache_in_snap(dstream, &in_snap);
if (err)
return err;
+ apfs_alert(sb, "11");
cache = &dstream->ds_cached_ext;
cache_blks = apfs_size_to_blocks(sb, cache->len);
@@ -1525,6 +1538,7 @@ static int apfs_dstream_get_new_block(struct apfs_dstream_info *dstream, u64 dsb
if (!in_snap && apfs_dstream_cache_is_tail(dstream) &&
logical_addr == cache->logical_addr + cache->len &&
phys_bno == cache->phys_block_num + cache_blks) {
+ apfs_alert(sb, "12");
cache->len += sb->s_blocksize;
dstream->ds_ext_dirty = true;
return 0;
@@ -1535,6 +1549,7 @@ static int apfs_dstream_get_new_block(struct apfs_dstream_info *dstream, u64 dsb
apfs_err(sb, "extent cache flush failed for dstream 0x%llx", dstream->ds_id);
return err;
}
+ apfs_alert(sb, "13");
if (dstream_blks < dsblock) {
/*
@@ -1542,17 +1557,20 @@ static int apfs_dstream_get_new_block(struct apfs_dstream_info *dstream, u64 dsb
* it must happen after the flush to avoid conflict with those
* extent operations.
*/
+ apfs_alert(sb, "14");
err = apfs_create_hole(dstream, dstream_blks, dsblock);
if (err) {
apfs_err(sb, "hole creation failed for dstream 0x%llx", dstream->ds_id);
return err;
}
+ apfs_alert(sb, "15");
}
cache->logical_addr = logical_addr;
cache->phys_block_num = phys_bno;
cache->len = sb->s_blocksize;
dstream->ds_ext_dirty = true;
+ apfs_alert(sb, "16");
return 0;
}
int APFS_GET_NEW_BLOCK_MAXOPS(void)
I have no explaination, but I get still the same output. I'm building the module with dkms and the sources in /usr/src/linux-apfs-* correspond to the patched ones. Obviously then I generate new initramfs and efi images.
From the emergency shell if I remount manually the root with mount -o rw,remount /
it works and crashes only trying to continue the boot pressing Ctrl+D.
If I try to remount the root starting a service such as
[Unit] Description=Remount APFS root
[Service] Type=exec ExecStart=/usr/bin/mount -o rw,remount /
it crashes with the same error.
You are probably using the same old build somehow, I never remember how dkms works. Maybe try to build the driver directly, just by running make?
I managed to reproduce this. It seems that there is some general problem with my mmap in newer kernels, nothing to do with the filesystem being root. I'll see if I can fix it and get back to you.
Oh ok, thanks. Well I saw dkms rebuilding it, so it should be the patched one. I'll try building it with make.
Don't worry about it, now that I can reproduce here it makes more sense for me to debug this myself. Thanks.
I just pushed a patch for this issue, if you can let me know if it helped. (It's on the development branch)
Thanks! I think I won't be able to test it until next week, but I'll try!
Ok I tested it. Still the same error, but now I have more logs: https://pastebin.com/y9TrDsYS
Are you sure you built the 0.3.4 tag? Because having the exact same error is very odd, and the messages you are getting seem to come from the patches I sent you before (you don't need to use those anymore).
I built the development branch, previously with the patch I had no further log messages. I'll perform a clean install to clear any doubt
Clean install and it works perfectly, thanks! :)
Really? Awesome, I thought it would be harder.
I did a little post to celebrate! https://www.reddit.com/r/linux/comments/15xndn6/apfs_root_with_linuxapfsrw_module_v034
I really appreciate your work and the module, as it is, is a big achievement, thank you for it.
I was questioning whether it would be theoretically possible to use an APFS root on a Linux system as well as it is for ntfs3. I tried to install Arch in chroot and there was no issue, I added apfs in MODULES inside /etc/mkinitcpio.conf and regenerated the initramfs. Booting I get only to be sent to an emergency shell. Might there be a concurrency issue in boot? Should I write a hook for mkinitcpio?