madsmtm / objc2

Bindings to Apple's frameworks in Rust
https://docs.rs/objc2/
MIT License

Use `clang`'s native modules for header generation #448

Open madsmtm opened 1 year ago

madsmtm commented 1 year ago

clang has functionality for loading modules, which is in part how Swift gets its nice import Foundation.NSString.

We should use that in header-translator too, especially for feature-gating things (instead of feature-gating based on class name).

It can be enabled using something like:

-fmodules -fapinotes-modules -fmodules-cache-path=./target

Though after doing that, I'm having trouble with getting libclang to give me a cursor to each module, so that I can check what's in it!

Other possibly interesting options: -Xclang -fmodule-format=raw and -fmodule-feature=objc.
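
As a rough illustration of how the flags above might be assembled before being handed to whatever parses the headers (for example as the argument list of a libclang translation unit), here is a minimal sketch. The function name `module_flags` is hypothetical and not part of header-translator; only the three flags themselves come from the comment above.

```rust
use std::path::Path;

/// Build the clang arguments that enable modules, as listed above.
/// `module_flags` is a hypothetical helper used only for illustration.
fn module_flags(cache_dir: &Path) -> Vec<String> {
    vec![
        // Enable clang's module support.
        "-fmodules".to_string(),
        // Also load API notes that are attached to modules.
        "-fapinotes-modules".to_string(),
        // Keep the compiled module cache in a known directory.
        format!("-fmodules-cache-path={}", cache_dir.display()),
    ]
}
```

Calling `module_flags(Path::new("./target"))` reproduces the exact flag list given above.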

madsmtm commented 1 year ago

I think this is going to be yet another case of "libclang does not expose enough", and the path forwards is probably https://github.com/madsmtm/objc2/issues/345

madsmtm commented 1 year ago

A huge gain from this would be in compile-times: We could feature-flag things depending on module name instead of the current, clunky class name!

Though reconsidering, we can maybe already do this? I know that the module.modulemap file can change this, and we should handle that in the future, but most frameworks just have the behaviour "export each separate header as a submodule".

So maybe we can just use filenames in the first iteration?
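
A minimal sketch of what "just use filenames" could look like, assuming the framework exports each header as its own submodule and ignoring any custom module.modulemap. The helper names and the `Foundation_NSString`-style feature naming are assumptions made for illustration, as is treating the umbrella header as the top-level module.

```rust
use std::path::Path;

/// Guess the clang module name for a framework header from its path alone.
/// Hypothetical helper: assumes "one submodule per header".
fn module_name_from_header(path: &Path) -> Option<String> {
    // Header name without extension, e.g. "NSString".
    let header = path.file_stem()?.to_str()?;
    // Name of the nearest enclosing `*.framework` directory, e.g. "Foundation".
    let framework = path.ancestors().find_map(|dir| {
        let name = dir.file_name()?.to_str()?;
        name.strip_suffix(".framework")
    })?;
    if header == framework {
        // Assumption: the umbrella header maps to the top-level module.
        Some(framework.to_string())
    } else {
        Some(format!("{framework}.{header}"))
    }
}

/// Turn a module name into a Cargo feature name (hypothetical naming scheme).
fn cargo_feature(module: &str) -> String {
    module.replace('.', "_")
}
```

With this, `Foundation.framework/Headers/NSString.h` maps to the module `Foundation.NSString` and the feature `Foundation_NSString`, while a header outside any framework yields `None` and would need separate handling.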

Concretely, statements would require the following cfg attributes:

Many of these are similar to what we've already implemented, although I suspect that if we drop the "classes have their superclasses transitively enabled" behaviour, it will be harder to gather the list of required features elsewhere (and it will be more confusing in docs). Then again, this is already a problem for all the other types, so solving it for classes alone will not be enough.

silvanshade commented 1 year ago

clang has functionality for loading modules, which is in part how Swift gets its nice import Foundation.NSString.

We should use that in header-translator too, especially for feature-gating things (instead of feature-gating based on class name).

[...]

Though after doing that, I'm having trouble with getting libclang to give me a cursor to each module, so that I can check what's in it!

[...]

I think this is going to be yet another case of "libclang does not expose enough", and the path forwards is probably https://github.com/madsmtm/objc2/issues/345

@madsmtm This sounds like a good idea going forward and indeed seems related to #345.

Just to give an update on my last comment in that issue, I've been working on a new rewrite of the bindings I initially made for ClangImporter, now using a more efficient combination of autocxx and hand-written cxx bindings.

It's still in a private repo but I plan to replace the contents of silvanshade/framework-translator with it soon (though I will probably rename it to clang-importer-cxx).

In any case, the bindings are much more complete now and I'm nearly at the point where it's possible to query modules and headers for declarations and then walk through the AST nodes.

Since the ClangImporter machinery seems heavily oriented around clang modules, I guess it makes sense that it ties into the approach you're suggesting.

The current status of the bindings is that it's possible to create an ASTContext and ClangImporter instance and work with most of the basic related LLVM and clang data structures and support machinery.

I'm currently working on adding enough remaining bindings from swift/include/swift/AST to be able to query the ASTContext and ClangImporter for useful information about modules and declarations. I think I should have something working for that by the end of next week judging by the current rate of progress.

I was planning on finishing that up and then cleaning up the build process before making the new version public, but since this may be relevant for your ideas here, I can try to update the current repo with what I've got sooner and worry about finishing the rest later.

Just to give an example of how using ClangImporter via those bindings will look, here is a unit test for ClangImporter (from the Swift repo) translated to Rust:

/// Create a temporary cache on disk and clean it up at the end.
#[test]
fn cache() -> BoxResult<()> {
    let temp = tempfile::tempdir()?;

    // Initialize default clang importer options.
    moveit!(let mut clang_importer_options = unsafe { swift::ClangImporterOptions::new() });

    // Create a cache subdirectory for the modules and PCH.
    let cache = temp.path().join("cache");
    std::fs::create_dir(&cache)?;
    unsafe { clang_importer_options.as_mut().set_module_cache_path(&cache) };
    unsafe { clang_importer_options.as_mut().set_precompiled_header_output_dir(&cache) };

    // Create the includes.
    let include = temp.path().join("include");
    std::fs::create_dir(&include)?;

    let module_dot_modulemap = include.join("module.modulemap");
    let a_dot_h = include.join("A.h");
    let bridging_dot_h = include.join("bridging.h");

    unsafe { clang_importer_options.as_mut().modify_extra_args_push_back("-nosysteminc") };
    {
        let include = include.as_os_str().to_str().expect("path should be a valid UTF-8 string");
        unsafe { clang_importer_options.as_mut().modify_extra_args_push_back(&format!("-I{include}")) };
    }
    {
        std::fs::write(&module_dot_modulemap, indoc! {r#"
            module A {
                header "A.h"
            }
        "#})?;
        std::fs::write(&a_dot_h, indoc! {r#"
            int foo(void);
        "#})?;
        std::fs::write(&bridging_dot_h, indoc! {r#"
            #import <A.h>
        "#})?;
    }

    // Create a bridging header.
    unsafe { clang_importer_options.as_mut().set_bridging_header(&bridging_dot_h) };

    // Set up the importer.
    moveit!(let mut lang_options = unsafe { swift::LangOptions::new() });

    let (os_was_invalid, arch_was_invalid) = {
        let arch = llvm::Twine::from("x86_64");
        let vendor = llvm::Twine::from("apple");
        let os = llvm::Twine::from("darwin");
        moveit!(let target = unsafe { llvm::Triple::new(&arch, &vendor, &os) });
        unsafe { lang_options.as_mut().set_target(target) }
    };
    if os_was_invalid {
        return Err("invalid os".into());
    }
    if arch_was_invalid {
        return Err("invalid arch".into());
    }

    moveit!(let mut sil_options = unsafe { swift::SilOptions::new() });
    moveit!(let mut type_checker_options = unsafe { swift::TypeCheckerOptions::new() });

    unsafe { llvm::initialize_llvm() };

    moveit!(let mut search_path_options = unsafe { swift::SearchPathOptions::new() });
    moveit!(let mut symbol_graph_options = unsafe { swift::symbolgraphgen::SymbolGraphOptions::new() });
    moveit!(let mut source_manager = unsafe { swift::SourceManager::new() });
    moveit!(let mut diagnostic_engine = unsafe { swift::DiagnosticEngine::construct_from_source_manager(source_manager.as_mut()) });

    let mut ast_context = unsafe {
        swift::AstContext::get(
            lang_options.as_mut(),
            type_checker_options.as_mut(),
            sil_options.as_mut(),
            search_path_options.as_mut(),
            clang_importer_options.as_mut(),
            symbol_graph_options.as_mut(),
            diagnostic_engine.as_mut(),
            |_module_name, _is_overlay| true,
        )
    };

    let swift_pch_hash = None;
    let dependency_tracker = None;
    let mut clang_importer =
        unsafe { swift::ClangImporter::create(ast_context.pin_mut(), swift_pch_hash, dependency_tracker) };

    let pch = cache.join("bridging.h.pch");
    std::fs::File::create(&pch)?;

    // Emit a bridging PCH and check that we can read the PCH.
    assert!(!unsafe { clang_importer.pin_mut().can_read_pch(&pch) });
    assert!(!unsafe { clang_importer.pin_mut().emit_bridging_pch(&bridging_dot_h, &pch) });
    assert!(unsafe { clang_importer.pin_mut().can_read_pch(&pch) });

    // Overwrite the PCH with garbage.  We should still be able to read it from the in-memory cache.
    std::fs::write(&pch, "garbage")?;
    assert!(unsafe { clang_importer.pin_mut().can_read_pch(&pch) });

    temp.close()?;
    Ok(())
}