Wiiseguy / node-nde

Winamp / Nullsoft Database Engine (NDE) reader
MIT License

Can you inspect the database using just the main.dat file (without the corresponding main.idx)? #1

Closed: davidlav closed this 2 years ago

davidlav commented 2 years ago

Sorry, this isn't really an issue so much as a question about accessing the database in general.

I lost some music recently in a botched drive migration but was hoping to maybe see what it was that I had lost by trying to inspect my Media Library database.

Unfortunately, my Winamp database updated itself after the migration, so it no longer contains any of the songs I lost. However, there's a backup copy of main.dat from a year ago in the same directory (called something like main.dat.n3w00001A60 with a 40MB file size, so I think it holds valid data) but, alas, there's no accompanying main.idx file that got saved along with it.

Using this library I was able to read my current database just fine, but when I pass it the path to the old .dat file while still using the current .idx file, unsurprisingly it doesn't work.

Do you know offhand how possible it would be to inspect that older copy of the .dat file without its associated .idx file? I don't have much experience working with streams, but I thought I might use your library as a jumping off point to see if I could rig something up and wondered if you might have any suggestions. Thanks!

Wiiseguy commented 2 years ago

I've had the same thing happen to me, but with the history.dat file. I've tried looking into extracting the data from the .dat file without an .idx, but ran into some problems. This is something I want to investigate soon, because the data should still be in there, it's just missing pointers to access them.

You could try to see if main.dat.n3w00001A60 contains the .idx file by checking if it contains the string "NDEINDEX". If you're lucky and that's the case, you could try extracting the idx file and save it as a separate file.
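A minimal sketch of that check in Node, assuming the file is small enough to read into memory in one go (a 40MB backup is). `findMarker` is a name made up for this example and is not part of node-nde:

```javascript
const fs = require('fs');

// Returns the byte offset of the first occurrence of `marker` in the file,
// or -1 if the marker does not appear anywhere in it.
// `findMarker` is a hypothetical helper, not a node-nde function.
function findMarker(filePath, marker = 'NDEINDEX') {
  // Read the whole file as raw bytes; no text decoding is applied.
  const data = fs.readFileSync(filePath);
  // Buffer#indexOf does a byte-wise search for the marker's bytes.
  return data.indexOf(Buffer.from(marker, 'latin1'));
}
```

Calling `findMarker('main.dat.n3w00001A60')` gives the offset of the first match, or `-1` if the string isn't there, which is the same answer a hex editor search would give.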

davidlav commented 2 years ago

Yeah, I have no need to "rebuild" the .idx file so that Winamp can use it again or anything like that. I just want to be able to scrape whatever data's in the .dat file and simply write it back out into a plain text file (csv, json, etc).

I played around with your library last night to try and get a sense for how dependent it was on the .idx file, and from what I can tell it's really only grabbing an offset value from it for each row of the database, correct? (I.e. not one for every column as well?)

So in that case, would it be plausible to recover the data by "brute-force" guessing the offset of each row? I'm imagining something like wrapping the part of NdeFileData that attempts to read a row from the database (the next function maybe?) in a try/catch block and feeding it different values for the offset until it doesn't throw an error. Then you'd record that offset somewhere (like in a separate file, so you don't have to re-guess the offsets you've already found the next time you rerun it) and keep repeating that process row by row as you slowly rebuild the offsets over a period of time. Might take a while, but do you think that would be feasible? I guess it really hinges on how simple it is to determine whether you've supplied an invalid offset or a valid one. Because if it is a simple test (error/no error), I don't see a reason why you couldn't just keep guessing until you hit the jackpot, read the row, and then start hunting again for the next offset value.
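The guessing loop described here could look roughly like the sketch below. It makes a big assumption: that some function can tell a valid offset from an invalid one by throwing. `scanOffsets` and `tryParse` are hypothetical names, and `toyParse` reads a made-up record layout (marker byte, length byte, ASCII payload) purely for illustration. None of this is node-nde's API or the real NDE format.

```javascript
// Try byte offsets one by one and keep those where `tryParse` does not throw.
// `tryParse(buffer, offset)` is a hypothetical callback that must return
// { record, length } on success and throw on an invalid offset.
function scanOffsets(buffer, tryParse, start = 0) {
  const found = [];
  let offset = start;
  while (offset < buffer.length) {
    try {
      const { record, length } = tryParse(buffer, offset);
      found.push({ offset, record });
      offset += Math.max(1, length); // jump past the record just read
    } catch (err) {
      offset += 1; // invalid guess: slide forward one byte and retry
    }
  }
  return found;
}

// Toy parser for an invented layout: 0xAB, one length byte, ASCII payload.
// This is NOT the NDE record format; it only makes the sketch runnable.
function toyParse(buffer, offset) {
  if (buffer[offset] !== 0xab) throw new Error('no record marker here');
  const len = buffer[offset + 1];
  if (len === undefined || offset + 2 + len > buffer.length) {
    throw new Error('record runs past end of buffer');
  }
  const record = buffer.toString('ascii', offset + 2, offset + 2 + len);
  return { record, length: 2 + len };
}
```

The list returned by `scanOffsets(buffer, toyParse)` could be written out with `JSON.stringify` so that a later run can resume from the last offset via the `start` parameter. How well this works on a real file depends entirely on how strict the validity check is: a lenient parser will "succeed" on junk bytes.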

Wiiseguy commented 2 years ago

Yeah, I was suggesting that maybe Winamp stuck the contents of the lost .idx file at the end of the .dat.n3w file when it did the "back-up".

I was thinking along the same lines regarding brute-forcing the read. My worry is that attributes related to a song may be strewn across the .dat file randomly. Like the filename could reside at pos 1, but the rating could be around pos 3,469,421, with no way to relate the two together without the idx. I might play around with that idea later this week. Feel free to create a branch if you want to take a stab at it.

Wiiseguy commented 2 years ago

Update: I think I've found a way to do this without an IDX file, will push changes soon!

davidlav commented 2 years ago

Awesome!

I have no idea how to go about looking for "NDEINDEX" in a binary file with Node (this is the first time in my life I've ever dealt with streams, I'm way out of my element here lol), but I did find a tool that could do it for me (I think), and there doesn't appear to be an "NDEINDEX" in any of the .dat files, only in the .idx ones. Which means it's strange that Winamp created these backups (presumably, I certainly didn't make them) without the associated index file necessary to read them.

https://github.com/stefankueng/grepWin

Wiiseguy commented 2 years ago

Ah, yeah that's what I meant, opening it in a hex editor and looking for that string.

But that's no longer necessary as I've pushed new changes to git and a new version to npm (1.1.0). I've also updated the README with a special section for an .idx-less read.

Let me know if it works!

davidlav commented 2 years ago

YOU DID IT!!! Oh my god, you have no idea how happy this makes me :-)

For some context: I was moving about 300 GB of music about five weeks ago from one drive to another so I could open up enough space to install a game of all things, and didn't realize until the other day that robocopy (which I've never had an issue with in the past) only copied over the top level directories in quite a few places, but none of the subdirectories or files themselves (it crashed a few times and I had to restart it). I have Backblaze, but naturally I didn't discover this until literally a few days after the 30-day history threshold had passed (needless to say, I'm now shelling out an extra $2/month for a full year of history). So I knew at a very vague level what I had lost, but it varied by case, since sometimes it only moved over empty top level directories. In some cases it was a specific album or a whole artist (easy enough to guess what was there), but in others it was a whole genre and I really had no idea what I had even lost.

So I started to wonder if there was any other place on my machine where an "echo" or "shadow" of what I lost still remained, until I finally figured that Winamp must store the result of its scan in a file somewhere, and lo and behold it had made that backed up copy like two years ago (because I had rescanned my drive after the migration and overwritten what had been there). And then I immediately saw that the index file was missing and nearly broke down because I felt that I was soooo close. It was there, I just couldn't read it. So then I started posting to the official Winamp forum, the Winamp subreddit, and finally found this library and started digging into it with my non-existent knowledge of streams or binary files to see what I could rig up, and here you've delivered my salvation.

As a fellow Winamp user, I probably don't need to explain the pain I felt when I lost that music and what it means to me to at least now know what it was that I lost so that I can start reassembling it again. I felt like I'd almost lost a piece of myself and just want to convey how grateful I am to you for writing this and helping me. I owe you a drink! Cheers!

Wiiseguy commented 2 years ago

You're welcome! I'm sorry you lost all that music, but I'm glad you have a starting point for getting it back now. Funnily, when looking for the NDE specification docs, I found this post. Was that you? 😋

I think this library now has a unique feature, as all the ones I've come across require an index file. If you don't mind, could you please make an update to the posts you made on the forum and subreddit? I think it may help out other folks with the same problem when they Google this in the future.

Thank you!

davidlav commented 2 years ago

Yes it was! And yep, I'll write a follow-up pointing people here. Thanks again!

Wiiseguy commented 2 years ago

Awesome, cheers!