dokan-dev / dokany

User mode file system library for windows with FUSE Wrapper
http://dokan-dev.github.io
5.22k stars 663 forks source link

BSOD on IFSTest on AppVeyor Win2012R2 image #519

Closed Rondom closed 7 years ago

Rondom commented 7 years ago

Environment

Check List

Description

Running IFSTest (Win10Anniversary) on the AppVeyor VS2015 image causes a BSOD. I have not tried to reproduce it anywhere else. The corresponding commit, has a modified mirror_test.ps1 that runs IFSTest. (I will also bug the AppVeyor-people with two reproducible test cases for BSOD, in one situation the current build step will silently be skipped, in the other one the build hangs completely...)

Logs

Can be found in the AppVeyor output. A memory.dmp can be downloaded there as well.

(I am also wondering why KD is not displaying any line-numbers in the AppVeyor-output. Maybe someone can tell me how to load the debug-symbols currectly.)

& "C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\kd.exe" -z $memory_dmp -y "SRV*${env:APPVEYOR_BUILD_FOLDER}\x64\Win8.1Release\*http://msdl.microsoft.com/download/symbols" -c "!analyze -v; q" }
Liryna commented 7 years ago

Hi @Rondom ,

Thank you trying to add IFSTest to appveyor 👍

Should probably change the build configuration to build as debug and not release 😢 @marinkobabic any idea what release pdb is not used ?

marinkobabic commented 7 years ago

Normally winding displays why the pub is not used. It shows if the pub has been found and the reason of not using it.

Am 26.05.2017 08:36 schrieb "Liryna" notifications@github.com:

Hi @Rondom https://github.com/rondom ,

Thank you trying to add IFSTest to appveyor 👍

Should probably change the build configuration to build as debug and not release 😢 @marinkobabic https://github.com/marinkobabic any idea what release pdb is not used ?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/dokan-dev/dokany/issues/519#issuecomment-304204350, or mute the thread https://github.com/notifications/unsubscribe-auth/ADHSODbpzgfeVWM8Y5tDMlLv2yXIr6eiks5r9nMAgaJpZM4NnKxo .

Rondom commented 7 years ago

Ok, I found out how to print more information (-lines parameter). If you have suggestions for other WinDbg-commands to run automatically, feel free to tell me.

The crash seems to occur somewhere in DokanDispatchCreate. You can download a dump including the PDB and binaries in the "Artifacts"-tab on AppVeyor.

Without having much of a clue about the code I tried reverting @bailey27's two recent commits 51e690f and 756ab5b, but that did not help.

I cannot reproduce the issue on my Win 2016 machine.

Liryna commented 7 years ago

Humm by looking at the code, this could happen if relatedFileName->Length is < sizeof(WCHAR)

Can you just add a check for it on your repository to make appveyor rerun the build ?

Rondom commented 7 years ago

Could it be that the relatedFileName->Length is the buffer length in sizeof(WCHAR) instead of the number of bytes? I did the change you suggested, but it did not help. Maybe I did something wrong?

  if (relatedFileName->Length > 0 && fileObject->FileName.Length > 0 &&
      relatedFileName->Length >= sizeof(WCHAR) &&  // test
  relatedFileName->Buffer[relatedFileName->Length / sizeof(WCHAR) -
                              1] != '\\' && fileObject->FileName.Buffer[0] != ':') {
    needBackSlashAfterRelatedFile = TRUE;
    fileNameLength += sizeof(WCHAR);
  }
Liryna commented 7 years ago

@Rondom Thanks for the test. Since the BSOD is DRIVER_PAGE_FAULT_BEYOND_END_OF_ALLOCATION (read / write outside alloc memory), looks like maybe relatedFileName->Length has a wrong value ? Can you comment all the code using relatedFileName to see if he is the issue here ?

Rondom commented 7 years ago

I did as you suggested. I did not spend too much time on understanding the code, so maybe I commented the wrong things and made it worse. I am now getting a crash at a completely different place :-(

Liryna commented 7 years ago

Ok looks like there is really an incorrect FileName->Length We are exactly in the same lengh calcul that we just commented: https://ci.appveyor.com/project/Rondom/dokany/build/1.0.3-376#L1413

Rondom commented 7 years ago

So, in order to understand it better: That one crash was in create.c in DokanDispatchCreate, the new crash is in DokanNotifyReportChange0 in dokan.c:478. How are the two connected? Should I try something else?

Liryna commented 7 years ago

They both use the same file object and length of it. Looks like somewhere we set a wrong length on this objet that produces the BSOD when we use it.

I will try to look later we does this could happen.

Rondom commented 7 years ago

OK, thanks. (I am a bit tired, so I am a bit slow at understanding)

marinkobabic commented 7 years ago

Can the issue be reproduced in a separate vm when running test ifstest.exe manually? If this is the case, then we can debug and investigate the source of the problem.

Rondom commented 7 years ago

I cannot reproduce it on Win2016. That's all I can say. if you have a Win 8.1 system, you should probably (or hopefully) be able to reproduce it.

Rondom commented 7 years ago

@Liryna what do you think about running the tests also on both Win2012R2 and Win2016 on AppVeyor? For this, we would need to download the artifact from the actual build, to use it in the "FSTest" configuration, because we cannot build in AppVeyor's Win2016-images. (This means that the two will be dependant on another). Any other drawbacks, objections?

Liryna commented 7 years ago

I do not see drawbacks or objections here. I had no idea this could be possible 😮 ! Would be interesting how can this looks like in the appveyor file.

Rondom commented 7 years ago

It looks a bit clumsy. AppVeyor will build all combinations, but we exclude the ones we do not want.

  image:
    - Visual Studio 2015
    - Visual Studio 2017
  matrix:
    exclude:
      - configuration: All
        image: Visual Studio 2017
      - configuration: Coverity
        image: Visual Studio 2017

Turns out it is not so easy to re-use the artifact from the "All"-configuration. Seems like AppVeyor needs a Token-Auth for their API even if it is read-only-access from the build-worker. I think it is not worth the effort in this case, because one could not safely enable it for pull requests. That also rules out some other things, like comparing tests between the previous build and the current build and printing the delta. 😢

Rather I think we can wait for the WDK to be installed on the VS2017-image (Win10), which has VS2015 installed as well, so that we can build there for the tests as well, while keeping the compilation environment as similar as possible to avoid testing builds that are too different from one another. appveyor/ci#1554

I think that's the best we can do with the free AppVeyor-infrastucture while keeping effort reasonable.

Rondom commented 7 years ago

I have just learned that no Auth-Token is needed for publicly available information. So I extended my code to run the tests both on Win2016 and 2012R2. The result was that we get a BSOD on both OSes.

My local tests were done on Win2016 with 0e159e3 and so were the first tests with BSOD on AppVeyor.

I will submit my PR as a draft. Hopefully, it can be merged once the BSOD is fixed.

Liryna commented 7 years ago

Thank you @Rondom ! Will try to look at this BSOD to move on for your PR, Unfortunately, I will only have time in july 😢

Great work again 💘 !

Liryna commented 7 years ago

@Rondom I have pull your branch / revert marinko PR that was creating issue on it and I am not able to reproduce the bsod 😢

ifstest result.txt

For me Virus, DefragEnhancements test group was crashing ifstest, it can probably be ignored with others

I try to add some check on appveyor build see if we can avoid the bsod about file length for now...

Rondom commented 7 years ago

Have you tried running it with the same verifier settings and maybe even running the script from the same path?

Otherwise there is still the possibility to RDP into the build VM...

Liryna commented 7 years ago

👍 verifier enable make the bsod happen !

Looks like the test that make crash is DirectoryFullPathCreationTest (just guessing since I seen FileFullPathCreationTest print before. But my virtualbox do not let my windows create a dump for a reason 😢 it just freeze. @marinkobabic are you using virtualbox ? do you have the same issue ?

Expected output without verifier enabled

Test         :DirectoryFullPathCreationTest
Group        :OpenCreateGeneral
Status       :C000001E (IFSTEST_TEST_NTAPI_FAILURE_CODE)
LastNtStatus     :00000000 STATUS_SUCCESS
ExpectedNtStatus :C000000D STATUS_INVALID_PARAMETER
Description  :{Msg# OpCreatG!dfulpath!13} Attempting to create a
              directory \pmtplufd with the temporary attribute set.
              This should have failed with invalid parameter.
marinkobabic commented 7 years ago

Yes. I habe the same issue. Normally I attach the kernel debugger so it's not a huge problem.

Am 02.07.2017 21:12 schrieb "Liryna" notifications@github.com:

👍 verifier enable make the bsod happen !

Looks like the test that make crash is DirectoryFullPathCreationTest (just guessing since I seen FileFullPathCreationTest print before. But my virtualbox do not let my windows create a dump for a reason 😢 it just freeze. @marinkobabic https://github.com/marinkobabic are you using virtualbox ? do you have the same issue ?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/dokan-dev/dokany/issues/519#issuecomment-312510984, or mute the thread https://github.com/notifications/unsubscribe-auth/ADHSOBV5cwRD7gk5GpKsthsjjkW5ZPCZks5sJ-uRgaJpZM4NnKxo .

Liryna commented 7 years ago

@marinkobabic Will try that ! @Rondom I found the BSOD issue, it is quite simple 😞 Can you just add to your PR:

  relatedFileName->Length = relatedFcb->FileName.Length;

Like here: https://github.com/Liryna/dokany/commit/6e59613b23247247b6683a8b2bdf69cbc0c8de4b#diff-86caa6b16e4044982075fe5d5bc59915R608

After that, all test pass without BSOD 👍

Liryna commented 7 years ago

Test currently running on my appveyor with the fix: https://ci.appveyor.com/project/Liryna/dokany/build/job/c74npkyfvs8w8ig7

Rondom commented 7 years ago

Only the assignment, i.e. without the other change in the if-statement in line 647?

Liryna commented 7 years ago

Only the assignment yes, the other line is me who revert a previous test that I did.

Rondom commented 7 years ago

Please double-check the commit, but if we get no BSOD on AppVeyor things should be fine and this issue is fixed and the PR can be merged. :raised_hands: