defuse / crackstation-hashdb

CrackStation.net's Lookup Table Implementation.
GNU General Public License v3.0

Can't sort large word list #20

Open YarNix opened 2 weeks ago

YarNix commented 2 weeks ago

This is not the fault of the program but rather of the C standard I/O library. In sortidx.c:

int sortFile(FILE *file, struct IndexEntry *sortBuffer, int64_t bufcount) {
    fseek(file, 0L, SEEK_END);
    int64_t size = ftell(file);
    if(size % INDEX_ENTRY_WIDTH != 0) {
        return 1;
    }
    /* the rest of the function... */
}

ftell returns a long, which is only 32 bits wide on Windows (and on 32-bit platforms), meaning that if the file is larger than 2 GB ftell will fail.

Solution

Create a failsafe that reads the whole file and counts the bytes, or use a native way to get the file size:
On Linux: fstat()
On Windows: GetFileSize(); to convert the C file descriptor to a Windows handle you can use (HANDLE)_get_osfhandle(fileno(file))

YarNix commented 2 weeks ago

This seems to work:

#include <stdio.h>
#include <stdint.h>
#include <errno.h>
#include <string.h>
#include <sys/stat.h>
#ifdef _WIN32
#define stat64 _stat64
#define fstat64 _fstat64
#endif
int sortFile(FILE *file, struct IndexEntry *sortBuffer, int64_t bufcount)
{
    if(fseek(file, 0L, SEEK_END) != 0) {
        printf("ERROR: fseek() failed!\n");
        return 1;
    }
    int64_t size = ftell(file);
    if(size == -1) {
        printf("ERROR: ftell() failed. %s. Using OS alternative instead.\n", strerror(errno));
        struct stat64 st;
        if (fstat64(fileno(file), &st) != 0) {
            // Nothing we can do
            printf("ERROR: Failed to determine file size. %s\n", strerror(errno));
            return 1;
        }
        size = st.st_size;
    }
    // function continues...
}