PointCloudLibrary / pcl

Point Cloud Library (PCL)
https://pointclouds.org/
Other
9.67k stars 4.59k forks source link

Boost::filesystem in PCL #5881

Open mvieth opened 7 months ago

mvieth commented 7 months ago

Context

Currently, PCL uses the filesystem library from Boost. In the long term, we would like to replace all functions and classes from Boost::filesystem with something else and remove this dependency. To find out where PCL uses Boost::filesystem, you can e.g. use grep -ri "boost.*filesystem" * In C++17, there is a std::filesystem library which could replace the one from Boost, but at the moment, PCL still aims to be compatible with C++14. In Boost 1.83.0, some functions from Boost::filesystem are deprecated. PCL should not use these deprecated functions any more. (see here, choose any build, select a macOS or Windows job, click Build Library, and search for filesystem)

Idea 1

Check if, instead of using functions and classes from Boost::filesystem, the same goal can be achieved in a different, C++14 compatible way (meaning, without using std::filesystem). This might for example be possible when simply checking the filename extension. In other situations, it may not be possible to find a reliable alternative on all platforms.

Idea 2

For the functions deprecated in Boost 1.83.0, that are used in PCL: check the alternative suggested by Boost. In which older Boost versions is this alternative available? PCL currently requires at least Boost 1.65.0.

Idea 3

If a function or class from Boost can not easily be replaced without using std::filesystem from C++17 (idea 1), then use std::filesystem like this:

#if (__cplusplus >= 201703L)
// TODO: new implementation using std::filesystem
#else
// existing implementation using boost::filesystem
#endif

This way, PCL stays C++14 compatible and we can automatically switch in the future.

If someone wants to work on this, please make sure that your pull requests are not too large (otherwise split them), and comment here first what you are planning to do.

SumitkumarSatpute commented 7 months ago

I would like to work on this, please do share your plan for this activity.

Thank You

mvieth commented 7 months ago

@SumitkumarSatpute Great! You could start with Idea 1. I would like to emphasize that not all functions/classes from Boost will be replaceable this way. The replacement should be C++14 compatible, work on all platforms (Windows, Linux, macOS), and not be too complex. I would suggest that you first post your suggested replacement here, and then create a pull request after one of us maintainers gives the okay

cybaol commented 7 months ago

@mvieth https://github.com/gulrak/filesystem would be OK?

mvieth commented 7 months ago

@mvieth https://github.com/gulrak/filesystem would be OK?

Interesting, thanks for the link. We would not want to have that library as a dependency, because one reason to move away from Boost::filesystem is to have fewer dependencies. But yes, in principle Idea 1 would mean doing something similar to what that library does (but only where easily possible; some functions in that library seem very long and complicated, in such cases I would rather choose std::filesystem from C++17)

cybaol commented 7 months ago

@mvieth Using experimental/filesystem? Here's a minimum example:

// main.cpp
#include <experimental/filesystem>
#include <iostream>
namespace fs = std::experimental::filesystem;
int main() {
    auto pwd = fs::current_path();
    std::cout << pwd.string() << std::endl;
    return 0;
}

And then

g++ main.cpp -o main -std=c++14 -lstdc++fs
SumitkumarSatpute commented 7 months ago

Sorry for late reply.

SumitkumarSatpute commented 7 months ago

@mvieth Using experimental/filesystem? Here's a minimum example:

// main.cpp
#include <experimental/filesystem>
#include <iostream>
namespace fs = std::experimental::filesystem;
int main() {
    auto pwd = fs::current_path();
    std::cout << pwd.string() << std::endl;
    return 0;
}

And then

g++ main.cpp -o main -std=c++14 -lstdc++fs

Doesn't it required a build change to an existing build process ?

cybaol commented 7 months ago

@mvieth Using experimental/filesystem? Here's a minimum example:

// main.cpp
#include <experimental/filesystem>
#include <iostream>
namespace fs = std::experimental::filesystem;
int main() {
    auto pwd = fs::current_path();
    std::cout << pwd.string() << std::endl;
    return 0;
}

And then

g++ main.cpp -o main -std=c++14 -lstdc++fs

Doesn't it required a build change to an existing build process ?

No

mvieth commented 7 months ago

experimental/filesystem is also not really what I had in mind. I don't have much experience with that, so I am not sure whether there is a guarantee when/which compilers offer experimental/filesystem? To give an example for "Idea 1": in some files, boost::filesystem is used to check whether the filename extension of a given file is ".pcd" or ".stl". This seems to be easily doable instead by checking the last four characters of the filename. Another example: when boost::filesystem::exists is used, there may be another, C++14 compatible way to achieve the same result. Perhaps the file is opened later anyway and we can then check whether opening it was successful.

SumitkumarSatpute commented 7 months ago
std::string getFileExtension(const std::string& filename)
{
    size_t dotPos = filename.find_last_of('.');
    if (dotPos != std::string::npos && dotPos < filename.length() - 1) 
    {
        return filename.substr(dotPos + 1);
    }
   else 
    {
        return "";
    }
}

bool hasExtension(const fs::path& filename, const std::string& extension)
{
    return filename.extension() == extension;
}

bool fileExists(const fs::path& filename) 
{
    return fs::exists(filename);
}

@mvieth Something like this ?

cybaol commented 7 months ago

@mvieth For Idea 2,I found the only two functions boost::filesystem::basename and boost::filesystem::extension used in PCL are deprecated. The alternatives are path::stem and path::extension respectively. I also found the alternatives were introduced in Boost 1.77.0 https://www.boost.org/doc/libs/1_77_0/libs/filesystem/doc/release_history.html So for now, perhaps the easiest way is to keep Boost. And the only following changes are made.

#if (BOOST_VERSION >= 107700)
// TODO: alternative implementation
#else
// existing implementation
#endif
mvieth commented 6 months ago

@SumitkumarSatpute The logic to get the filename extension looks good, but what is fs::path and fs::exists in hasExtension and fileExists?

@cybaol path::stem and path::extension seem to be available in 1.65.0 already ( https://www.boost.org/doc/libs/1_65_0/libs/filesystem/doc/reference.html#path-stem ), maybe even longer. So if we don't find a good Boost-free alternative, we can switch them without checking the Boost version (since PCL always requires at least Boost 1.65.0)

SumitkumarSatpute commented 6 months ago

@mvieth by fs::path | fs::exists I mean to use boost::filesystem

i.e

#include <boost/filesystem.hpp>

namespace fs = boost::filesystem;
cybaol commented 6 months ago

@mvieth by fs::path | fs::exists I mean to use boost::filesystem

i.e

#include <boost/filesystem.hpp>

namespace fs = boost::filesystem;

@SumitkumarSatpute It's mine.


#include <fstream>
#include <string>

bool ends_with(const std::string& str, const std::string& suffix) { const auto suffix_start = str.size() - suffix.size(); const auto result = str.find(suffix, suffix_start); return (result == suffix_start) && (result != std::string::npos); }

bool is_exists(const std::string& file_name) { std::ifstream file(file_name); return file.good(); }

mvieth commented 6 months ago

Okay, let's start with the following:

  1. Replace all occurrences of boost::filesystem::basename with path::stem (no checking for boost version needed)
  2. For all occurrences of boost::filesystem::extension(xyz) where xyz is a boost filesystem path object (e.g. it->path() or itr->path() comes up a few times), replace these with path::extension
  3. For occurrences of boost::filesystem::exists where the file is opened with std::ifstream or std::fstream afterwards, use the .good() function on the stream instead of boost's exists

Feel free to open a pull request for any of these three tasks, after commenting here which task you want to work on (one pull request per task, please).

cybaol commented 6 months ago

I am working on task 1, 2 and 3.

cybaol commented 6 months ago

@mvieth For Idea 3, in order to simplify the conditional macro in the code, a namespace alias is used when included(fs or similar). Here is my implementation.

#if (__cplusplus >= 201703L)
#include <filesystem>
namespace fs = std::filesystem;
#else
#include <boost/filesystem.hpp>
namespace fs = boost::filesystem;
#endif

And then replace all boost::filesystem with fs in the code.

mvieth commented 6 months ago

@cybaol That is a good idea, but are std::filesystem and boost::filesystem that similar when it comes to classes, methods, and their behaviour? At least the ones PCL uses? If we do this, I would suggest to use pcl_fs as the alias to avoid collisions. fs seems too popular :smile:

mvieth commented 5 months ago

I think a good next step would be to adapt https://github.com/PointCloudLibrary/pcl/blob/master/cmake/pcl_find_boost.cmake: If CMAKE_CXX_STANDARD is 17 or higher, do not list filesystem with the required boost modules, but with the optional modules instead. Also here: only link Boost::filesystem if it exists, like this:

if(TARGET Boost::filesystem)
  target_link_libraries("${LIB_NAME}" Boost::filesystem)
endif()

The behaviour we want to achieve (first three points already work, the last one does not work yet):

BillyONeal commented 5 months ago

In C++17, there is a std::filesystem library which could replace the one from Boost, but at the moment, PCL still aims to be compatible with C++14.

Do note that there are several differences between boost::filesystem and the standard one, in particular with how paths are decomposed, such as the "magic empty path" being replaced with the "magic dot". (That is /a/b/c/ decomposes to {"/", "a", "b", "c", ""} in boost but {"/", "a", "b", "c", "."} in std)

It seems likely to me that just picking between the two options "boost available, build everything using that" and "boost unavailable, try '17 when it works" is the right level of granularity. Trying to guess based on __cplusplus is scary because often different values for that are going to be present in any given program. (e.g. libA builds with C++14, libB builds with C++17, and they are linked with an executable built with C++20 is extremely common. Headers that inspect __cplusplus and make ABI decisions based on that are thus mildly doomed)