OSGeo / PROJ

PROJ - Cartographic Projections and Coordinate Transformations Library
https://proj.org

Things needed or wanted by Kurt for a Bazel based build #2998

Open schwehr opened 2 years ago

schwehr commented 2 years ago

Assigned to myself since I intend to slowly work through this list.

The things here may benefit other users of PROJ, but this issue is primarily for me to document the things I need to figure out to do a clean Bazel build of PROJ. I do not intend to commit the Bazel build, and my aim is to make only changes to PROJ that are reasonable to upstream. Some of the stuff here may end up being local patches for my build environment. Hopefully many of them will be just generally good cleanup. I don't intend to discuss the details of the following list on this issue. If something is worth discussing, it probably deserves a separate issue. Some will probably just be infeasible.

And probably more.

schwehr commented 2 years ago

For compiling in the SQLite PROJ db, it looks like writing a small read-only SQLite VFS driver that serves the database from a string will do the trick.

https://unix.stackexchange.com/questions/176111/how-to-dump-a-binary-file-as-a-c-c-string-literal

I gave xxd --include a try and it works pretty well. The output needs a bit of tweaking to set the array length and make it const, but that's pretty easy. I've used another solution in the past for loading stuff into GDAL's vsimem, but the code that builds the string isn't open source, so it's a no go. But making a small program to build a string literal isn't particularly difficult.
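
The "small program to build a string literal" idea could look roughly like the following. This is only a sketch under assumptions: `EmitByteArray` and the symbol name passed to it are hypothetical, not part of PROJ or xxd. It emits a byte array rather than a string literal, with the length written out and the array marked const, which are the two tweaks the xxd output needed.

```cpp
#include <cstddef>
#include <cstdio>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of PROJ or xxd): renders `data` as C++
// source defining a const byte array named `symbol` with an explicit length.
// Assumes `data` is non-empty, since a zero-length array is not valid C++.
std::string EmitByteArray(const std::string &symbol,
                          const std::vector<unsigned char> &data) {
    std::ostringstream out;
    out << "// Generated file, do not edit.\n";
    // const objects have internal linkage in C++, so declare it extern first.
    out << "extern const unsigned char " << symbol << "[" << data.size()
        << "];\n";
    out << "const unsigned char " << symbol << "[" << data.size() << "] = {";
    char buf[8];
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (i % 12 == 0) {
            out << "\n    ";  // start a new row every 12 bytes
        }
        std::snprintf(buf, sizeof(buf), "0x%02x,",
                      static_cast<unsigned>(data[i]));
        out << buf;
        if (i % 12 != 11 && i + 1 != data.size()) {
            out << ' ';
        }
    }
    out << "\n};\n";
    return out.str();
}
```

A build step would read proj.db into the vector, call the helper, and write the result to a generated .cc file.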

rouault commented 2 years ago

> I gave xxd --include a try and it works pretty well

There's a tip in https://discourse.cmake.org/t/support-for-embedding-data-in-a-manner-equivalent-to-xxd/2121/8 to make it CMake portable. One thing I vaguely remember reading is that compilers might limit the maximum size of a static array, or maybe this was just for strings? (MSVC limits a single string literal to about 2048 bytes: https://docs.microsoft.com/en-us/cpp/c-language/maximum-string-length?view=msvc-170.) The size of a read-only section in most object files should hardly be limited, but the C/C++ compiler frontend might limit it, hence the use of objcopy mentioned in the links you quoted. I can't find references for an array limit, though. In any case, this functionality could be restricted to a limited set of whitelisted compilers.

schwehr commented 2 years ago

@rouault Thanks for the follow-up. I think the objcopy option is unlikely to work well for my Bazel env. I see that MSVC has a similar limit for C++: https://docs.microsoft.com/en-us/cpp/cpp/string-and-character-literals-cpp. It looks like I'll have to do an array of string literals or something similar if I want a solution that will also work for MSVC. xxd isn't really required. I can make a Python program that builds exactly what we need, so there isn't hackery after the initial command.
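
A rough illustration of the "array of string literals" idea, written in C++ rather than Python to match the other code in this thread. Everything here is an assumption for illustration: `EmitChunkedLiterals`, the generated struct name, and the chunk size are hypothetical, and I have not checked whether MSVC counts source characters or resulting bytes against its limit. Each byte is written as a three-digit octal escape so that a following digit can never be absorbed into the escape.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical generator: splits `data` into string literals of at most
// `chunk_size` bytes each and emits them as an array of {data, size} pairs.
// Sizes are stored explicitly because the data may contain NUL bytes.
// Assumes `data` is non-empty.
std::string EmitChunkedLiterals(const std::string &symbol,
                                const std::vector<unsigned char> &data,
                                std::size_t chunk_size) {
    if (chunk_size == 0) {
        throw std::invalid_argument("chunk_size must be positive");
    }
    std::ostringstream out;
    out << "struct " << symbol
        << "_chunk { const char *data; unsigned long size; };\n";
    out << "static const " << symbol << "_chunk " << symbol
        << "_chunks[] = {\n";
    char buf[8];
    for (std::size_t start = 0; start < data.size(); start += chunk_size) {
        const std::size_t end = std::min(start + chunk_size, data.size());
        out << "    {\"";
        for (std::size_t i = start; i < end; ++i) {
            // Always three octal digits, e.g. 0x39 becomes \071.
            std::snprintf(buf, sizeof(buf), "\\%03o",
                          static_cast<unsigned>(data[i]));
            out << buf;
        }
        out << "\", " << (end - start) << "},\n";
    }
    out << "};\n";
    return out.str();
}
```

The cost of this layout is that the chunks are not guaranteed to be contiguous in memory, so the runtime would have to copy them into one buffer before handing the database to SQLite.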

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

schwehr commented 2 years ago

Hoping to work on this more soon. I think I have rounded up some help.

schwehr commented 1 year ago

Internally at Google, we are using a local-only system for embedding the data in C++. The parts I can share are:

void DatabaseContext::Private::open(const std::string &databasePath,
                                    PJ_CONTEXT *ctx) {
    setPjCtxt(ctx ? ctx : pj_get_default_ctx());
    std::string path(databasePath);
    if (path.empty()) {
        // BEGIN GOOGLE MODIFICATION
        static const FileToc *toc = [] {
          // Enable the memvfs extension.
          const int init_code = sqlite3_memvfs_init(nullptr, nullptr, nullptr);
          if (init_code != SQLITE_OK_LOAD_PERMANENTLY) {
            throw FactoryException(
                absl::StrCat("Can't initialize memvfs, with code ", init_code));
          }
          // Re-register 'unix' as default filesystem. MemVFS can be used
          // through URL parameters, or by passing as the fourth argument to
          // sqlite3_open_v2().
          const int register_code =
              sqlite3_vfs_register(sqlite3_vfs_find("unix"), /*makeDflt =*/1);
          if (register_code != SQLITE_OK) {
            throw FactoryException(absl::StrCat(
                "Can't re-register the default VFS, with code ", register_code));
          }
          return proj_db_create();
        }();
        // Load the database from a cc_embed_file, using the memvfs
        // extension of sqlite.
        path = absl::StrFormat("file:/proj?ptr=0x%x&sz=%d&max=%d&vfs=memvfs",
                               reinterpret_cast<uintptr_t>(toc[0].data),
                               toc[0].size, toc[0].size);
        // END GOOGLE MODIFICATION
    }
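
For readers without Abseil, the URI construction in the excerpt above can be sketched with the standard library alone. `MemvfsUri` is a hypothetical helper name, not something in PROJ. The `ptr`, `sz` and `max` query parameters are the ones the excerpt passes to memvfs, and the `/proj` path component is kept from it as an arbitrary label.

```cpp
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: builds a memvfs URI for a read-only in-memory
// database image located at `data` and spanning `size` bytes. The memory
// must stay valid for as long as the database connection is open.
std::string MemvfsUri(const void *data, std::size_t size) {
    char buf[128];  // ample room for a 64-bit pointer and two sizes
    std::snprintf(buf, sizeof(buf),
                  "file:/proj?ptr=0x%" PRIxPTR "&sz=%zu&max=%zu&vfs=memvfs",
                  reinterpret_cast<std::uintptr_t>(data), size, size);
    return buf;
}
```

The resulting string would be passed to sqlite3_open_v2() with the SQLITE_OPEN_URI flag set, after the memvfs extension has been initialized as in the excerpt.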

rouault commented 1 year ago

Regarding proj.db embedding, I see https://github.com/duckdblabs/duckdb_spatial/blob/main/spatial/src/spatial/proj/module.cpp has a fairly similar approach to https://github.com/OSGeo/PROJ/issues/2998#issuecomment-1435043655

rouault commented 3 weeks ago

  • Is it possible to have a clean way to optionally compile in the proj.db so that no separate file has to exist? e.g. have sqlite directly load an in-memory db from a .data block?

cf https://github.com/OSGeo/gdal/pull/10913

schwehr commented 5 days ago

https://github.com/OSGeo/gdal/pull/10913 will greatly simplify things for me. Woohoo!

And I need to go back through this issue and see where things are as it has been a long time.

rouault commented 5 days ago

> cf OSGeo/gdal#10913

Actually, this note dates back to before the split between the GDAL and PROJ RFCs. The PROJ-relevant part is https://github.com/OSGeo/PROJ/pull/4274