Scanner is a new struct that opens a file system directory and lazily enumerates its contents, using an API similar to the one bufio.Scanner provides for enumerating lines in an io.Reader.

Reader has been updated to use the new Scanner struct for reading entire directory contents. The API is not changed, but the scratch buffer passed to ReadDirents and ReadDirnames is now ignored. All of the Windows vs non-Windows code has been moved into the Scanner struct, so the ReadDirents and ReadDirnames code is now the same for all operating systems.

Walk has been updated to use the new Scanner struct for lazily reading directory contents. The API is not changed, except that the Config.ScratchBuffer field is now ignored, and its documentation has been updated accordingly. Some details of how Walk uses Scanner are warranted. If Walk is told to enumerate directory contents unsorted, it uses the Scanner to lazily read and process each child entry in the directory. Otherwise, it reads the contents of the entire directory, builds a slice of Dirent, sorts it, and then enumerates the sorted entries. This matters because Walk is required to stop processing when visiting a node returns the special filepath.SkipDir error value. To ensure deterministic results when that happens, so that the same files are visited each time, the directory must be enumerated in sorted order, unless the caller specifically requested unsorted enumeration.

Advantages

Reads smaller chunks of the file system into memory, even for directories with very many child entries or entries with long file system names.
Reading the directory's contents in smaller chunks avoids allocating a working buffer larger than the page size. One caveat: a single file system entry could require more than a page to hold its contents. That is only possible if the entry's name is longer than the page size, which is unlikely on modern file systems.
On operating systems other than Windows, the file mode type of a child entry is looked up only when requested, and then from data already pulled from the file system, which avoids another file system request. When the caller never requests an entry's mode type, it is never computed from the entry's contents.

Disadvantages

As mentioned above, ReadDirents and ReadDirnames no longer use the scratch buffers passed to them, but because their API was already set, the parameters remain in place, possibly causing confusion. Source code documentation has been updated to state that the parameters are now ignored.
The Config.ScratchBuffer config parameter is ignored, possibly causing confusion, but its documentation reflects its deprecated status. The parameter was not removed, to avoid introducing a backwards-incompatible API change.

Closes #34

karrick / godirwalk

Scanner struct allows fast and lazy enumeration of directory contents #38
