dart-lang / watcher

A file system watcher library for Dart.
https://pub.dev/packages/watcher
BSD 3-Clause "New" or "Revised" License
138 stars 35 forks source link

Add/enable Windows tests #124

Closed DanTup closed 2 years ago

DanTup commented 2 years ago

These tests currently crash on Windows for me - calling await watcher.ready results in:

SocketException: OS Error: Access is denied.
, errno = 5, port = 0

And an entry in event viewer:

Faulting application name: dart.exe, version: 0.0.0.0, time stamp: 0x61a5f0de
Faulting module name: ntdll.dll, version: 10.0.19041.1288, time stamp: 0xa280d1d6
Exception code: 0xc0000005
Fault offset: 0x0000000000063416
Faulting process ID: 0x710
Faulting application start time: 0x01d80bce9da4ec85
Faulting application path: N:\Apps\Dart\nightly\bin\dart.exe
Faulting module path: C:\WINDOWS\SYSTEM32\ntdll.dll
Report ID: e34a430f-91d6-468e-afa0-9ba0ef817621
Faulting package full name: 
Faulting package-relative application ID: 

I'm not sure whether it's just my machine though, so @jakemac53 could you allow this to run on the bots to see if it crashes there?

(I also enabled some other tests that pass with the exception of one, which is also marked as skip for mac-os with #21 which I think is a similar issue).

jakemac53 commented 2 years ago

could you allow this to run on the bots to see if it crashes there?

Since you have contributed now they should just automatically run now :).

DanTup commented 2 years ago

Oh yeah :-)

Same thing happened here... It's good that it's consistent, though the crash kinda concerns me (I just landed changes in the analysis server that's awaiting those ready futures!). I'll take a look shortly and see what I can find. Thanks!

DanTup commented 2 years ago

The issue is occurring when this sandbox teardown function tries to asynchronously delete the folder:

image

If I change it to a sync delete, it works:

image

So the issue is only with testing, not with the implementation (phew!).

I'll file an SDK issue with a repro for this. In the meantime, is there value in changing package:test_descriptor to do a sync delete here? (sadly the code isn't in this repo so it's not just a change here).

jakemac53 commented 2 years ago

I'll file an SDK issue with a repro for this. In the meantime, is there value in changing package:test_descriptor to do a sync delete here? (sadly the code isn't in this repo so it's not just a change here).

It is definitely sketchy that a sync delete is making it pass given we are awaiting the delete there. It makes me thing some other unawaited async task is possibly deleting the file?

The change to make it sync isn't really problematic so we would likely accept the change, but it does indicate some underlying issue somewhere.

DanTup commented 2 years ago

Actually, I can repro in isolation when using a sync delete too, but the error occurs after it completes. I'm not sure whether this is a package:watcher issue or an SDK issue (I see the Windows Directory Watcher has some timers and such in there).

import 'dart:io';

import 'package:watcher/src/directory_watcher/windows.dart';

Future<void> main() async {
  var dir1 = Directory.systemTemp.createTempSync('dart_test_').path;

  var watcher = WindowsDirectoryWatcher(dir1);
  watcher.events.listen((event) {});
  await watcher.ready;

  try {
    print('Deleting $dir1!');
    Directory(dir1).deleteSync();
    print('Deleted!');
  } catch (e) {
    print('ERROR! $e');
  }
}

This outputs:

Deleting C:\DANNYT~1\dart_test_227cc3cf!
Deleted!
Unhandled exception:
SocketException: Access is denied.

 (OS Error: Access is denied.

, errno = 5), port = 0
Exited (255)

Since it printed Deleted, it seems like it completed - but then something crashed with Access is Denied without a stack trace. @jakemac53 any ideas? It doesn't happen if you don't create the watcher, or if you don't delete the folder :-)

jakemac53 commented 2 years ago

Oh, is this just what happens on windows if you delete the directory out from under the watcher? What if you close the watcher before deleting it?

DanTup commented 2 years ago

Ah yes, cancelling the sub seems to fix it too. I'll update the tests. Not sure if that makes this a bug or not though?

jakemac53 commented 2 years ago

Not sure if that makes this a bug or not though?

Ya I don't know what the intended behavior is for this case. Some exception being thrown seems reasonable, but the one that is thrown here is not very useful.

DanTup commented 2 years ago

FileSystemException: Deletion failed, path = 'C:\Users\runneradmin\AppData\Local\Temp\dart_test_801d36eb' (OS Error: The directory is not empty. , errno = 145)

Oh boy, this old chestnut :-)

Windows is pretty bad for failing deletes that occur immediately after a folder has been used. I think the usual fix is either to ignore the error (perhaps writing a warning), or try a few times with a small delay. I don't know if either are reasonable to put into test_descriptor/sandbox though, as I'm not sure who uses it.

jakemac53 commented 2 years ago

Interesting that we haven't seen this on other repos... but maybe the watcher makes the issue come up more often or something?

If adding a retry mechanism fixes it I suppose we could do that. Does just waiting for some small # of ms at the end of each of the tests in this package resolve things in the time being?

Could possibly even do this in a tearDown.

DanTup commented 2 years ago

I actually misread the error above thinking it was "a process is using the file", however it esems like the issue was the same - unclosed watchers.

I added some code to call startClosingEventStream in a teardown in case a test hadn't called it, though I'm not entirely sure if that's reasonable based on its comment (if it were, it could be added to a global teardown and removed from the tests?).

Still, there's one remaining test that seems like a bug (or at least unexpected behaviour). I can mark this as skip on Windows and file an issue if the change otherwise seems ok.

DanTup commented 2 years ago

(FWIW I did first try just a delay but it didn't seem to help.. I don't see any of these most recent failures locally, so I don't know whether it's timing difference, Windows version differences, or something else)

jakemac53 commented 2 years ago

The last failure just looks like a timing difference to me, I am fine with skipping it on windows for now.

DanTup commented 2 years ago

@jakemac53 ok, I can add a skip - should I file an issue about it too? Are the changes to startClosingEventStream reasonable? (if so, I think startClosingEventStream could be dropped from all the tests that are calling it explicitly? although that seems odd to me).

jakemac53 commented 2 years ago

should I file an issue about it too

Ya I would, and link the issue in the skip message.

Are the changes to startClosingEventStream reasonable?

Yes I think so, they look fine to me.

(if so, I think startClosingEventStream could be dropped from all the tests that are calling it explicitly? although that seems odd to me).

This is less clear - I would probably not touch them. I feel like they are explicitly closing early to try and check that we don't throw if you close the watcher in the middle of some events coming through? I am not sure why else those calls would exist otherwise.

jakemac53 commented 2 years ago

Is this ready to land @DanTup ?

DanTup commented 2 years ago

Yep, if you're happy with it :-)