OutpostUniverse / OP2Utility

C++ library for working with Outpost 2 related files and tasks.
MIT License
4 stars 0 forks source link

Make XFile::GetFilenamesFromDirectory reject nonFilenames #255

Closed Brett208 closed 5 years ago

Brett208 commented 5 years ago

Might be good to update unit tests. Ran out of time at least for now though.

-Brett

Brett208 commented 5 years ago

Unit test added.

Brett208 commented 5 years ago

This change is meant as a bugfix. I designed GetFilenamesFromDirectory to only return files which I thought this was fairly clear by the name. By returning directories, it is causing problems for OP2Archive when sub-directories exist in a directory marked for packing.

If we want to change the scope, the name should change to something else. Perhaps GetItemsFromDirectory.

What you are proposing sounds like a fairly significant change. Perhaps it should be fleshed out in an issue. Also, I believe the filesystem on Linux can contain links, sockets, and named pipes which should be discussed if opening to beyond just files.

I agree with the posted article that range based operations in C++ STL are clunky. (But then again, what isn't clunky in the language?) I would recommend learning about LINQ as well before implementing anything. It supports performing operations on data in general, although I typically apply it to containers like C#'s list or C# array. It also supports deffered operations. This has interesting consequences such as allowing definitions of several operations to perform before it is applied which could save computation cycles (but causes logic headaches just like threading).

https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/linq/

I don't have any issue with providing range based operations for XFile, although it sounds more complex to implement than I was looking to dedicate time to.

-Brett

DanRStevens commented 5 years ago

I wouldn't go so far as to suggest we implement range based directory iteration right now. More like use it as a sense of inspiration for where we might go in the future. I list it here as it's related, not because I propose doing it.

The fundamentals behind LINQ sounds like it may be similar to the concept of Ranges and Filters. Though I must admit, I've never been much of a fan of the SQL-like syntax.


I suppose I consider "Filename" to include directory names (as well as special files, symbolic links, and all the rest). I'm not a fan of "GetItemsFromDirectory". As addressed in the Clean Code Collection, "Item" makes for a terrible name.


In terms of the changes here, I feel this is an application level concern, which is getting rolled into a library method. I'm pretty sure we'll regret doing that later. I think the real fix should go into OP2Archive, not in OP2Utility.

Brett208 commented 5 years ago

I would not lump filename to include directories or other special items that could be found in directories. Perhaps GetDirectoryContents.

This contributes to bugs in OP2MapImager. test.map is a legitimate directory name on Windows, which would be returned as a map filename by the overload std::vector<std::string> GetFilenamesFromDirectory(const std::string& directory, const std::string& fileType); It is hard for me to understand how pushing a fileType of .map can return a directory and not be considered a bug. (Maybe fileType here could be renamed to fileExtension in long run).

DanRStevens commented 5 years ago

Ehh, GetDirectoryContents doesn't sound very intuitive. A programmer probably wouldn't know to use that name unless they read the documentation. Even with IntelliSense, I'd still feel the need to read documentation before using it.

I think one of the problems here is we'd be fighting precedence. I'm much more used to the concept of directory iteration returning everything by default, with results needing to be filtered if you want things like directories removed.

As for the bugs in OP2MapImager, I'd say they are bugs in OP2MapImager, not in OP2Utility.


You're right "test.map" is a valid name for a directory. There is no restriction that extensions only apply to files. Additionally, Linux doesn't even have a concept of extensions. At least, in terms of deciding what program to use to open a file, Linux doesn't consider the file extension. It has other methods for determining what program to use.

It's not valid to assume a name with an extension necessarily represents a file, nor that it necessarily does not represent a directory.

Brett208 commented 5 years ago

This bugfix still needs to be merged. Still looking to release new versions OP2Archive and OP2MapImager which depends on this fix.

Happy to see XFile updated in the future to allow some means of returning subdirectory names as some other effort.

DanRStevens commented 5 years ago

You're right, this has been open a while. We should deal with it.

I'm still against this change though. We shouldn't push application level concerns into the library. The fact that you need only filenames, and not directory names, is a concern that is specific to your particular use case in that particular calling project. Your suggestion here is to effectively hardcode an application specific data filter into a generic library, into the sole method for retrieving directory contents, to solve the problem in your calling application.

In general, this kind of filtering should be done in the calling project, not in the library.


As the GetFilenamesFromDirectory returns a vector of filenames, it could be filtered after the fact:

    auto files = XFile::GetFilenamesFromDirectory("");
    files.erase(
        std::remove_if(
            files.begin(),
            files.end(),
            [](std::string path){ return !XFile::IsFile(path); }
        ),
        files.end()
    );

If this is a common data filtering need, it's reasonable to put support into the library, through additional methods:

std::vector<std::string> SelectOnlyFiles(std::vector<std::string> paths) {
    paths.erase(
        std::remove_if(
            paths.begin(),
            paths.end(),
            [](std::string path){ return !XFile::IsFile(path); }
        ),
        paths.end()
    );
    return paths;
}

Usage:

    auto files = SelectOnlyFiles(XFile::GetFilenamesFromDirectory(""));
Brett208 commented 5 years ago

This PR is to fix a bug where a library function does not return the intended results. The current design of the function GetFilesFromDirectory (renamed to GetFilenamesFromDirectory) and its overloads is to only return filenames, but due to an oversight on my part it happens to return filenames, directories, and other directory items. OP2Archive and OP2MapImager are already using GetFilenamesFromDirectory as intended.

DanRStevens commented 5 years ago

Yes, and I have code that depends on the actual behavior.

If you want to "correct" the method, please don't remove the existing behavior. I can understand if you want to rename things, perhaps lay claim to the original intended name, but please leave the original functionality under a new name.

Perhaps rename the original to GetDirectoryContents. Though that doesn't quite follow the same pattern as the other name. Some adjustment would be needed.

Brett208 commented 5 years ago

Other PRs have solved the issue in a different manner. Closing.

DanRStevens commented 5 years ago

Tagging PRs #291, #292, #293. I believe those are the related PRs.