cgsecurity / testdisk

TestDisk & PhotoRec
https://www.cgsecurity.org/
GNU General Public License v2.0

NetCDF data recovery #148

Closed by cdelv 4 months ago

cdelv commented 5 months ago

Hi, I'm trying to recover NetCDF (.nc) files using PhotoRec, but I have run into one issue.

The test for whether a file is a NetCDF file is just a comparison of the first four bytes of the file with the netCDF "magic number", which is the bytes 'C', 'D', 'F', SOH (Start Of Header or ASCII control-A, used for the version number of the netCDF format which is still version 1). If you use the Unix "od" command to look at the beginning of a netCDF file as characters with "od -c", you should always see the following as the first four bytes:

% od -c foo.nc
0000000   C   D   F 001  ...

So I added the following to the .sig file:

nc 0 "CDF"

After some testing, fidentify was able to detect many different .nc files. The problem is that PhotoRec did not recover anything, and I'm wondering whether I just had bad luck or whether I have to do something else to recover binary files.

I would appreciate it if someone could help me. Best regards.

cgsecurity commented 5 months ago

Your signature is correct. PhotoRec should be able to recover files using it. To reduce the number of false positives, you can instead use

nc 0 "CDF" 0x01

cdelv commented 5 months ago

@cgsecurity, thank you for the response. After testing on another drive, I got a ton of false positives. However, I noticed this pattern:

hexdump -C wrfout_d01_2018-01-01.nc | head
00000000  43 44 46 02 00 00 00 04  00 00 00 0a 00 00 00 09  |CDF.............|
00000010  00 00 00 04 54 69 6d 65  00 00 00 00 00 00 00 0a  |....Time........|
00000020  44 61 74 65 53 74 72 4c  65 6e 00 00 00 00 00 13  |DateStrLen......|
00000030  00 00 00 0b 73 6f 75 74  68 5f 6e 6f 72 74 68 00  |....south_north.|
00000040  00 00 00 91 00 00 00 09  77 65 73 74 5f 65 61 73  |........west_eas|
00000050  74 00 00 00 00 00 01 38  00 00 00 0a 62 6f 74 74  |t......8....bott|
00000060  6f 6d 5f 74 6f 70 00 00  00 00 00 1d 00 00 00 0f  |om_top..........|
00000070  62 6f 74 74 6f 6d 5f 74  6f 70 5f 73 74 61 67 00  |bottom_top_stag.|
00000080  00 00 00 1e 00 00 00 10  73 6f 69 6c 5f 6c 61 79  |........soil_lay|
00000090  65 72 73 5f 73 74 61 67  00 00 00 04 00 00 00 0e  |ers_stag........|
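For readers wondering what the bytes after the magic number mean, here is a minimal Python sketch that decodes the start of the dump above. It assumes the classic netCDF header layout (big-endian 32-bit fields: record count, then a dimension list opened by the tag 0x0A, with names padded to 4-byte boundaries), which holds for versions 1 and 2 but not for the 64-bit-data variant. The function name `parse_dimensions` and the `sample` variable are hypothetical, and the sketch is not part of PhotoRec.

```python
import struct

NC_DIMENSION = 0x0A  # tag that opens the dimension list in the classic format


def parse_dimensions(buf: bytes):
    """Return (version, numrecs, [(name, length), ...]) from a header prefix.

    Assumes the classic layout (versions 1 and 2). Stops early if buf ends
    before all declared dimensions have been read.
    """
    if len(buf) < 16 or buf[:3] != b"CDF":
        raise ValueError("not a classic netCDF header")
    version = buf[3]
    numrecs, tag, count = struct.unpack_from(">III", buf, 4)
    if tag != NC_DIMENSION:
        return version, numrecs, []  # dimension list absent
    dims = []
    pos = 16
    for _ in range(count):
        if pos + 4 > len(buf):
            break
        (name_len,) = struct.unpack_from(">I", buf, pos)
        padded = (name_len + 3) & ~3  # names are padded to a 4-byte boundary
        end = pos + 4 + padded + 4
        if end > len(buf):
            break
        name = buf[pos + 4 : pos + 4 + name_len].decode("utf-8", "replace")
        (length,) = struct.unpack_from(">I", buf, pos + 4 + padded)
        dims.append((name, length))
        pos = end
    return version, numrecs, dims


# First 68 bytes of the hexdump above (offsets 0x00 to 0x43).
sample = bytes.fromhex(
    "43444602 00000004 0000000a 00000009"
    "00000004 54696d65 00000000 0000000a"
    "44617465 5374724c 656e0000 00000013"
    "0000000b 736f7574 685f6e6f 72746800"
    "00000091"
)
print(parse_dimensions(sample))
```

In this dump the fourth byte is 0x02 rather than 0x01, and the two bytes right after the magic number are zero because they are the high bytes of the record count.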

Is there a way to add the rest of the header info?

cgsecurity commented 5 months ago

You can use several signatures, here for the 32-bit (CDF v1) and the 64-bit (CDF v2) versions:

nc 0 "CDF" 0x01
nc 0 "CDF" 0x02

If there are still too many false positives, you can try

nc 0 "CDF" 0x01 0x00 0x00
nc 0 "CDF" 0x02 0x00 0x00
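If false positives still slip through, one option is to post-filter the recovered files outside PhotoRec. The following is a rough Python sketch, not a PhotoRec feature: it keeps only files whose first six bytes equal one of the two longer patterns suggested here. The helper names `looks_like_netcdf` and `filter_recovered` are hypothetical.

```python
from pathlib import Path

# Byte patterns equivalent to the two six-byte signatures suggested here.
SIGNATURES = (b"CDF\x01\x00\x00", b"CDF\x02\x00\x00")


def looks_like_netcdf(path) -> bool:
    """True if the file starts with one of the six-byte patterns."""
    with open(path, "rb") as fh:
        return fh.read(6) in SIGNATURES


def filter_recovered(directory):
    """Return the files in directory that match a pattern, sorted by path."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and looks_like_netcdf(p)
    )
```

Calling `filter_recovered` on a PhotoRec output directory returns the paths worth inspecting further, for example with a netCDF reader.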

Good luck

cdelv commented 5 months ago

Thank you