MatrixAI / js-encryptedfs

Encrypted Filesystem for TypeScript/JavaScript Applications
https://polykey.com
Apache License 2.0
10 stars 3 forks source link

`EncryptedFS.rmdir` does not remove files on Windows #69

Closed emmacasolin closed 1 year ago

emmacasolin commented 1 year ago

Describe the bug

EncryptedFS.rmdir() does not work as expected on Windows. It removes directories but not files.


Update: The issue may in fact be that Linux and Mac are removing files when they shouldn't and Windows is actually behaving correctly. This would mean that the tests are wrong. The tests that fail on Windows due to this all throw an EEXIST error when trying to open a (new) file and create a file descriptor to it. When this happens the error looks something like this:

ErrorEncryptedFSError: EEXIST: file already exists, dir\file1

      2041 |             // Target already exists cannot be created exclusively
      2042 |             if (flags & constants.O_CREAT && flags & constants.O_EXCL) {
    > 2043 |               throw new errors.ErrorEncryptedFSError({
           |                     ^
      2044 |                 errno: errno.EEXIST,
      2045 |                 path: path as string,
      2046 |                 syscall: 'open',

      at constructor_._open (src/EncryptedFS.ts:2043:21)
      at src/EncryptedFS.ts:1885:15
      at Object.maybeCallback (src/utils.ts:405:12)
      at Object.<anonymous> (tests/EncryptedFS.concurrent.test.ts:832:12)

To Reproduce

Run the following script on Windows:

import fs from 'fs';
import os from 'os';
import path from 'path';
import Logger, { LogLevel, StreamHandler } from '@matrixai/logger';
import { DB } from '@matrixai/db';
import EncryptedFS from './src/EncryptedFS';
import * as utils from './src/utils';
import INodeManager from './src/inodes/INodeManager';

async function main() {
  const logger = new Logger(`${EncryptedFS.name} Concurrency`, LogLevel.WARN, [
      new StreamHandler(),
  ]);
  const dbKey: Buffer = utils.generateKeySync(256);
  let dataDir: string;
  let db: DB;
  let iNodeMgr: INodeManager;
  let efs: EncryptedFS;

  dataDir = await fs.promises.mkdtemp(
    path.join(os.tmpdir(), 'encryptedfs-test-'),
  );
  db = await DB.createDB({
    dbPath: dataDir,
    crypto: {
      key: dbKey!,
      ops: {
      encrypt: utils.encrypt,
      decrypt: utils.decrypt,
      },
    },
    // @ts-ignore - version of js-logger is incompatible (remove when js-db updates to 5.* here)
    logger: logger.getChild(DB.name),
  });
  iNodeMgr = await INodeManager.createINodeManager({
    db,
    logger: logger.getChild(INodeManager.name),
  });
  efs = await EncryptedFS.createEncryptedFS({
    db,
    iNodeMgr,
    logger,
  });

  const path1 = path.join('dir', 'file1');
  await efs.mkdir('dir');
  await efs.mkdir('dir/dir2');
  let fd = await efs.open(path1, 'wx+');
  await efs.close(fd);
  console.log('Dir exists before rmdir? ', await efs.exists('dir'));
  console.log('Dir2 exists before rmdir? ', await efs.exists('dir/dir2'));
  console.log('FD exists before rmdir? ', await efs.exists(path1));
  await efs.rmdir('dir', { recursive: true });
  console.log('Dir exists after rmdir? ', await efs.exists('dir'));
  console.log('Dir2 exists after rmdir? ', await efs.exists('dir/dir2'));
  console.log('FD exists after rmdir? ', await efs.exists(path1));

  await efs.stop();
  await fs.promises.rm(dataDir, {
    force: true,
    recursive: true,
  });
}

main()

Output:

Dir exists before rmdir?  true
Dir2 exists before rmdir?  true
FD exists before rmdir?  true
Dir exists after rmdir?  false
Dir2 exists after rmdir?  false
FD exists after rmdir?  true

Expected behavior

The file should not exist after rmdir is called. This is observed when running the same script on Linux:

Dir exists before rmdir?  true
Dir2 exists before rmdir?  true
FD exists before rmdir?  true
Dir exists after rmdir?  false
Dir2 exists after rmdir?  false
FD exists after rmdir?  false

Platform

Additional context

May be related to https://github.com/MatrixAI/Polykey-CLI/issues/14

CMCDragonkai commented 1 year ago

What is an FD?

CMCDragonkai commented 1 year ago

Rmdir is not supposed to remove files. See: https://nodejs.org/api/fs.html#fspromisesrmdirpath-options.

CMCDragonkai commented 1 year ago

The rmdir cannot remove file descriptors. It's not supposed to be able to do this. The API on node fs does not allow this.

CMCDragonkai commented 1 year ago

What tests are failing here due to this?

emmacasolin commented 1 year ago

What tests are failing here due to this?

So far it's all of the concurrency tests that have failures, but I haven't gone over all of the failures yet. I can put together a list.

CMCDragonkai commented 1 year ago

Yea please create list. And also identify the offending usage of rmdir on file descriptors. It should never be used on file descriptors.

On Linux and MacOS, it should be throwing ENOTDIR if you are using it with file descriptors.

tegefaulkes commented 1 year ago

Looking at https://nodejs.org/api/fs.html#fspromisesrmdirpath-options I think that this should be throwing an ENOENT or ENOTDIR error if the directories contain a file even with the recursive: true.

Not to be confused with rm witch will remove files when recursive and force are true.

emmacasolin commented 1 year ago

Yea please create list. And also identify the offending usage of rmdir on file descriptors. It should never be used on file descriptors.

On Linux and MacOS, it should be throwing ENOTDIR if you are using it with file descriptors.

It's not being directly supplied a file descriptor, rather rmdir is given a path to a directory and if recursive is set to true then any file descriptors within the directory/sub directories get removed as well.

CMCDragonkai commented 1 year ago

any file descriptors within the directory/sub directories get removed as well.

Directories and subdirectories cannot "contain" file descriptors.

CMCDragonkai commented 1 year ago

Can you point out a part of the EFS where it is doing this? I don't see any file descriptors within directories. There are no such thing. FDs are just numbers.

emmacasolin commented 1 year ago

The example script in the issue description here is essentially what's happening in all of the affected tests. The FD points to a file inside the directory that rmdir gets called on.

CMCDragonkai commented 1 year ago
  1. Deleting directories/files should work fine, because it has nothing to do with FDs. Even if an fd is opened on a subdirectory/file. Again deleting a directory and file will work.
  2. Any FD opened SHOULD be closed by the user.

The above 2 points are true independently of each other.

CMCDragonkai commented 1 year ago

Deleting directories and files is about deleting their hardlinks. There is no interaction with the FD system.

emmacasolin commented 1 year ago

Tests that are affected by this issue:

EncryptedFS.concurrent.test.ts

These are the ones that I'm fairly certain are affected by this. There are other test failures but whether or not they are related to this I'm not sure.

tegefaulkes commented 1 year ago

First note, rmdir doesn't have a --recursive flag in the console. The option for the node.fs implementation specifies that the recursive option is deprecated. So I don't think we should even support it.

If we want recursive behaviour then we should be using rm. There is already an issue for adding it in. MatrixAI/Polykey#60.

Some details from the nodejs documentation.

Testing with fs.promises.rmdir(path, {recursive: true}) results in all of the directories getting removed including the file.

So my suggest fix for this is removing the recursive option for rmdir and implementing rm with issue MatrixAI/Polykey#60. Any usage of rmdir with the recursive flag should be using rm instead.

tegefaulkes commented 1 year ago

The fact that it has different behaviour is weird. I'm going to look into this more.

As for the test failures. Looks like a lot of the stem from this problem. We're using rmdir all over the place to clean the state for the next part of the test. Since on windows the files are not removed along with the directories then we run into issues for the rest of the test. Usually a EEXISTS error.

There also seems to be a problem with permissions that needs exploring.

CMCDragonkai commented 1 year ago

First note, rmdir doesn't have a --recursive flag in the console. The option for the node.fs implementation specifies that the recursive option is deprecated. So I don't think we should even support it.

If we want recursive behaviour then we should be using rm. There is already an issue for adding it in. MatrixAI/Polykey#60.

Some details from the nodejs documentation.

  • Using fsPromises.rmdir() on a file (not a directory) results in the promise being rejected with an ENOENT error on Windows and an ENOTDIR error on POSIX.
  • recursive [<boolean>] If true, perform a recursive directory removal. In recursive mode, operations are retried on failure. DEPRECATED
  • If an EBUSYEMFILEENFILEENOTEMPTY, or EPERM error is encountered, Node.js retries the operation with a linear backoff wait of retryDelay milliseconds longer on each try. This option represents the number of retries. This option is ignored if the recursive option is not trueDefault: 0.

Testing with fs.promises.rmdir(path, {recursive: true}) results in all of the directories getting removed including the file.

So my suggest fix for this is removing the recursive option for rmdir and implementing rm with issue MatrixAI/Polykey#60. Any usage of rmdir with the recursive flag should be using rm instead.

Yes, see MatrixAI/Polykey#60. It's already an issue for this.

However rmdir had recursive option in the past, and I'd like to keep backwards compatibility for now.

CMCDragonkai commented 1 year ago

Yes this is why it doesn't make sense to me about any kind of file descriptor situation. @emmacasolin posted a reproduction up top, that you can try to see between matrix-win-1 and Linux.

Try using SSH it's alot faster. Instructions are in matrix-team.

tegefaulkes commented 1 year ago

I found the problem in the example. It's the usage of path.join(). the separator is different under windows.

In the example we use the following to create the directory state.

 const path1 = path.join('dir', 'file1');
  await efs.mkdir('dir');
  await efs.mkdir('dir/dir2');
  let fd = await efs.open(path1, 'wx+');
  await efs.close(fd);

In linux this is fine, however in windows the path.join is using the \ separator. This means that our directory structure actually looks like.

// under linux and mac
dir/
    dir2/
    file1

// under windows
dir/
    dir2/
'dir\\file1'

So the example under windows we're only removing the dir directory and contents. leading to our problem. This also explains what confused me before since AFAIK file1 shouldn't be able to exist if it's parent directories were removed. With the exception of hardlinks and open descriptors.

So the next question is. how does and should the encrypted fs handle this difference between platforms? I'll have to review how the paths work currently to know what's it's doing currently.

CMCDragonkai commented 1 year ago

Ah this is a simple problem to solve. Back in VFS, we always used our own custom path join. We never just use path.join. That changes depending on the OS.

There should be a pathJoin in our utils.ts, if there isn't, it should be.

CMCDragonkai commented 1 year ago

Make sure all tests and all src files are using utils.pathJoin.

const pathJoin = pathNode.posix ? pathNode.posix.join : pathNode.join;

And any pathResolve should also be using utils.pathResolve. It's already in the utils.

tegefaulkes commented 1 year ago

Yep, I'll make the changes now.

tegefaulkes commented 1 year ago

Is there a PR that exists for this issue? MatrixAI/Polykey#67? or make a new one?

CMCDragonkai commented 1 year ago

Yea use MatrixAI/Polykey#67.

tegefaulkes commented 1 year ago

The fix has been pushed to MatrixAI/Polykey#67. There are some failing tests there but I tested the fix on staging and it passed.

tegefaulkes commented 1 year ago

Fix has been moved to MatrixAI/Polykey#68 now.