Open alexcrichton opened 6 years ago
For ArchiveWrapper.cpp, a fun project could also be trying to use a crate on crates.io instead!
What's the status with LLVM upstream these days? How compatible is Rust with upstream LLVM?
We track upstream pretty closely, but all of the above bindings occasionally require updates to make sure we compile against upstream. To that end it'd still be best to upstream this!
cc @petrhosek, we were talking about this at RustConf, and while dated it still has some good information I think!
I'm interested in working on this. Besides the benefit to Rust, I also think LLVM benefits from having a more complete C API. Since this has been open for a while: Are there any parts of this that are no longer relevant and I can avoid spending time on?
Any progress on this? Downstream Rust packagers would also greatly benefit from this by means of much shorter build times.
Why would it result in much shorter build times? These shims are small compared to the rest of rustc and LLVM and work with precompiled LLVM distributions too, not just the rust fork of LLVM.
Oh, sorry, I thought it was a blocker for using precompiled LLVM.
Visiting for T-compiler backlog bonanza.
Seems like a good list (if long) of relatively simple tasks.
@rustbot label: S-tracking-impl-incomplete
In general we try to use the LLVM C API whenever we can as it's generally nice and stable. It also has the great benefit of being maintained by LLVM so it tends to never be a pain point when upgrading LLVM! Unfortunately though LLVM's C API isn't 100% comprehensive and we often need functionality above and beyond what you can do with just C.
For this custom functionality we typically use the C++ API of LLVM and compile our own shims which then in turn have a C API. At the time of this writing all of the C++ to C shims are located in the src/rustllvm directory across three main files: ArchiveWrapper.cpp, PassWrapper.cpp, and RustWrapper.cpp. These files are all compiled via build.rs, where we basically use llvm-config to guide us in how to compile those files.
The downside of these shims, however, is that they're difficult for us to maintain over time. They cause problems whenever we upgrade LLVM (we have to get them compiling again, as the C++ APIs change quite regularly). They also make it harder for consumers of Rust to use custom LLVM versions. For example, right now our shims compile on LLVM 5 but probably not on LLVM trunk. And for users who like to follow LLVM trunk, keeping up with the breakage of our shims can be quite difficult!
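The shim pattern described above can be sketched roughly as follows. This is only an illustration and not code from src/rustllvm: `demo::NameTable` is a made-up stand-in for some C++-only class, and every `DemoNameTable*` function is a hypothetical name. The point is just that the C++ object hides behind an opaque handle and only C-compatible types cross the boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a C++-only API (hypothetical; not an LLVM class).
namespace demo {
class NameTable {
public:
  void add(std::string Name) { Names.push_back(std::move(Name)); }
  std::size_t size() const { return Names.size(); }
  const std::string &get(std::size_t Idx) const { return Names.at(Idx); }

private:
  std::vector<std::string> Names;
};
} // namespace demo

// Opaque handle type: the C (or Rust FFI) side never sees the class layout.
typedef struct DemoOpaqueNameTable *DemoNameTableRef;

static demo::NameTable *unwrap(DemoNameTableRef T) {
  return reinterpret_cast<demo::NameTable *>(T);
}

extern "C" DemoNameTableRef DemoNameTableCreate(void) {
  return reinterpret_cast<DemoNameTableRef>(new demo::NameTable());
}

extern "C" void DemoNameTableAdd(DemoNameTableRef T, const char *Name) {
  assert(T && Name);
  unwrap(T)->add(Name);
}

extern "C" std::size_t DemoNameTableSize(DemoNameTableRef T) {
  assert(T);
  return unwrap(T)->size();
}

// Returns a pointer to the name's bytes and stores its length in *Len.
// The pointer is only valid until the table is modified or disposed.
// Returns nullptr if Idx is out of range.
extern "C" const char *DemoNameTableGet(DemoNameTableRef T, std::size_t Idx,
                                        std::size_t *Len) {
  assert(T && Len);
  if (Idx >= unwrap(T)->size())
    return nullptr;
  const std::string &S = unwrap(T)->get(Idx);
  *Len = S.size();
  return S.data();
}

extern "C" void DemoNameTableDispose(DemoNameTableRef T) { delete unwrap(T); }
```

Every change to the wrapped C++ class has to be absorbed inside the shim bodies, which is where the maintenance burden comes from.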
To help solve this problem it seems the ideal solution is to try to upstream at least a big chunk of the C++ APIs that we're using. This way we can much more closely stick to LLVM's C API which is far more stable. It makes it that much easier for us to eventually upgrade LLVM and it makes users using a custom LLVM not need to worry about using an LLVM beyond the one that we're using (aka LLVM trunk).
I'll try to have a checklist here we can maintain over time which also is a good listing of what each of the APIs does!
ArchiveWrapper.cpp
In general this is functionality for reading archive *.a files in the Rust compiler. This makes reading rlibs (which are archive files) extra speedy. The functions here are:
LLVMRustOpenArchive
LLVMRustDestroyArchive
LLVMRustArchiveIteratorNew
LLVMRustArchiveIteratorFree
LLVMRustArchiveIteratorNext
LLVMRustArchiveChildName
LLVMRustArchiveChildData
LLVMRustArchiveChildFree
LLVMRustArchiveMemberNew
LLVMRustArchiveMemberFree
LLVMRustWriteArchive
These functions are basically just reading and writing archives, using iterators for reading and providing a list of structs for writing.
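To make the "iterators for reading" shape concrete, here is a minimal sketch of an iterator-style C API. It does not parse real archives: `DemoArchive` is a hypothetical in-memory stand-in holding (name, data) pairs, and all `DemoArchive*` names are invented for illustration rather than taken from ArchiveWrapper.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an opened archive: a list of (name, data) members.
struct DemoArchive {
  std::vector<std::pair<std::string, std::string>> Members;
};

// Iterator state: which archive we walk and how far we have got.
struct DemoArchiveIterator {
  const DemoArchive *Archive;
  std::size_t Pos;
};

// One member, borrowed from the archive (the archive must outlive the child).
struct DemoArchiveChild {
  const std::string *Name;
  const std::string *Data;
};

extern "C" DemoArchiveIterator *DemoArchiveIteratorNew(const DemoArchive *A) {
  assert(A);
  return new DemoArchiveIterator{A, 0};
}

// Returns the next member, or nullptr once the archive is exhausted.
// The caller releases each child with DemoArchiveChildFree.
extern "C" DemoArchiveChild *DemoArchiveIteratorNext(DemoArchiveIterator *It) {
  assert(It);
  if (It->Pos >= It->Archive->Members.size())
    return nullptr;
  const auto &M = It->Archive->Members[It->Pos++];
  return new DemoArchiveChild{&M.first, &M.second};
}

extern "C" const char *DemoArchiveChildName(const DemoArchiveChild *C,
                                            std::size_t *Len) {
  assert(C && Len);
  *Len = C->Name->size();
  return C->Name->data();
}

extern "C" const char *DemoArchiveChildData(const DemoArchiveChild *C,
                                            std::size_t *Len) {
  assert(C && Len);
  *Len = C->Data->size();
  return C->Data->data();
}

extern "C" void DemoArchiveChildFree(DemoArchiveChild *C) { delete C; }
extern "C" void DemoArchiveIteratorFree(DemoArchiveIterator *It) { delete It; }
```

A caller loops on `DemoArchiveIteratorNext` until it returns a null pointer, freeing each child as it goes and the iterator at the end.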
PassWrapper.cpp
This file is where we get into a bit more of a smorgasbord of random functions rather than a consistent theme, so I'll comment on more of them inline below.
A general theme here I've found as I wrote these down is that it's not critical that all of these are implemented. I could imagine that it would be possible to have a mode where we as rustc still compile shims sometimes (like the ones below) but many of the shims are stubbed out to not actually use LLVM at all if we're in "non-Rust-LLVM mode" (aka custom LLVM mode). In other words, we don't necessarily need to upstream 100% of these functions.
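One way such a stubbed-out mode could look is sketched here. Everything in it is an assumption made for illustration: the `DEMO_CUSTOM_LLVM` macro, the `DemoPrintTargetCPUs` function, and the CPU names are all hypothetical, and the real build would need its own way of detecting a custom LLVM.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical build flag: defined when compiling against a custom LLVM
// that lacks the hooks the full implementation would need.
#ifdef DEMO_CUSTOM_LLVM

// Stub: keeps the symbol available so callers still link, but does nothing
// beyond reporting that the feature is unavailable.
extern "C" bool DemoPrintTargetCPUs(std::FILE *Out) {
  assert(Out);
  std::fprintf(Out, "Target CPU help is not supported by this LLVM version.\n");
  return false;
}

#else

// Placeholder data standing in for what the real backend would report.
static const char *const DemoCPUs[] = {"generic", "demo-cpu-a", "demo-cpu-b"};

// Full version: prints the (placeholder) CPU list and reports success.
extern "C" bool DemoPrintTargetCPUs(std::FILE *Out) {
  assert(Out);
  std::fprintf(Out, "Available CPUs for this target:\n");
  for (const char *CPU : DemoCPUs)
    std::fprintf(Out, "    %s\n", CPU);
  return true;
}

#endif
```

Because both variants export the same symbol with the same signature, the Rust side would not need to know which one it was linked against; it only checks the return value.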
LLVMInitializePasses - not entirely sure why we can't use the upstream versions. Someone more knowledgeable with LLVM may know how to replace this!
LLVMRustFindAndCreatePass - this is how we add custom passes to a pass manager by their string name
LLVMRustPassKind - categorizes whether a pass is a function or module pass
LLVMRustAddPass - add a custom pass to a pass manager
LLVMRustPassManagerBuilderPopulateThinLTOPassManager - thin wrapper around the C++ API to populate a ThinLTO pass manager
LLVMRustHasFeature - this is actually a pretty tricky one. It has to do with https://github.com/rust-lang/rust/issues/46181 and is I think the only function which actually only works with our fork. I can provide more information for this if necessary.
LLVMRustPrintTargetCPUs - mostly just a debugging helper we could stub out in the custom LLVM case.
LLVMRustPrintTargetFeatures - same as above
LLVMRustCreateTargetMachine - here we create a TargetMachineRef ourselves, which also gives us full access to all the fields; upstreaming would probably just involve exposing more field accessors, setters, and such.
LLVMRustDisposeTargetMachine - complement to the above
LLVMRustAddAnalysisPasses - I think this is just adding "standard" passes to the pass manager IIRC, we're just trying to mirror what clang is doing here.
LLVMRustConfigurePassManagerBuilder - just configuring some fields, again also aimed at mirroring clang.
LLVMRustAddBuilderLibraryInfo - again, attempting to mirror clang by configuring all the fields
LLVMRustAddLibraryInfo - mirroring clang
LLVMRustRunFunctionPassManager - seems ripe to add upstream!
LLVMRustSetLLVMOptions - I think this is for one-time configuration of LLVM at startup
LLVMRustWriteOutputFile - there's a whole bunch of ways to write output files with LLVM; if we had something that just wrote it out to memory or a file that'd be good enough for us
LLVMRustPrintModule - I'm pretty sure this is mainly just generating IR, but I'm not personally too familiar with the need for a custom class here
LLVMRustPrintPasses - AFAIK a debugging helper, could be stubbed out with a custom LLVM
LLVMRustAddAlwaysInlinePass - may just be missing upstream?
LLVMRustRunRestrictionPass - I think this is part of our LTO bindings, internalizing lots of stuff
LLVMRustMarkAllFunctionsNounwind - definitely part of our LTO bindings, for when you're compiling with -C lto and -C panic=abort
LLVMRustSetDataLayoutFromTargetMachine - not entirely sure what this is...
LLVMRustGetModuleDataLayout - also not entirely sure what this is...
LLVMRustSetModulePIELevel - I think just configuring more properties
LLVMRustThinLTOAvailable - for us just testing the LLVM version right now
LLVMRustWriteThinBitcodeToFile - mostly just what it says on the tin
LLVMRustThinLTOBufferCreate - same as above but in memory
LLVMRustThinLTOBufferFree - freeing the above
LLVMRustThinLTOBufferPtr - reading the above
LLVMRustThinLTOBufferLen - reading the above
LLVMRustParseBitcodeForThinLTO - mostly what it says on the tin
These APIs are all related to ThinLTO and are still somewhat in flux, so there may not be a great C API just yet:
LLVMRustCreateThinLTOData
LLVMRustFreeThinLTOData
LLVMRustPrepareThinLTORename
LLVMRustPrepareThinLTOResolveWeak
LLVMRustPrepareThinLTOInternalize
LLVMRustPrepareThinLTOImport
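The in-memory buffer functions in this file come as a Create / Ptr / Len / Free quartet, and that shape is easy to sketch in isolation. The code below is a hedged illustration only: `DemoBuffer` and the `DemoBuffer*` functions are hypothetical, and "serializing" here just means copying bytes behind a made-up 4-byte `DEMO` header rather than writing real bitcode.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Owns the serialized bytes; callers only ever hold a pointer to this struct.
struct DemoBuffer {
  std::string Data;
};

// "Serializes" the payload into a freshly allocated buffer. The "DEMO"
// header is a placeholder for whatever a real serializer would emit.
extern "C" DemoBuffer *DemoBufferCreate(const char *Payload, std::size_t Len) {
  assert(Payload || Len == 0);
  DemoBuffer *Buf = new DemoBuffer();
  Buf->Data = "DEMO";
  if (Len != 0)
    Buf->Data.append(Payload, Len);
  return Buf;
}

// Borrowed view of the bytes; valid until DemoBufferFree is called.
extern "C" const char *DemoBufferPtr(const DemoBuffer *Buf) {
  assert(Buf);
  return Buf->Data.data();
}

extern "C" std::size_t DemoBufferLen(const DemoBuffer *Buf) {
  assert(Buf);
  return Buf->Data.size();
}

extern "C" void DemoBufferFree(DemoBuffer *Buf) { delete Buf; }
```

Splitting pointer and length into two accessors keeps every signature expressible in plain C, at the cost of two calls per read.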
RustWrapper.cpp
Sort of an even bigger smorgasbord than PassWrapper.cpp! Note that many of these functions are very old and may have actually made their way into the C API of LLVM by now, in which case that'd be awesome!
LLVMRustCreateMemoryBufferWithContentsOfFile - this is something we can and probably should write ourselves rather than relying on LLVM
LLVMRustGetLastError - this is a Rust-specific API for getting out an error message, I'd imagine that whenever it's set we'd have something analogous in LLVM.
LLVMRustSetLastError - used by the C++ code to set the error that rustc will retrieve later
LLVMRustSetNormalizedTarget - I think this is just exposing something that wasn't already there.
LLVMRustPrintPassTimings - debugging on our end.
LLVMRustGetNamedValue - I think this is just fun dealing with metadata
LLVMRustGetOrInsertFunction - needed that C++ function most likely.
LLVMRustGetOrInsertGlobal - again, probably just needed the function
LLVMRustMetadataTypeInContext - more constructors for more types
LLVMRustAddCallSiteAttribute - just a "fluff" thing we needed to do that wasn't possible in C IIRC
LLVMRustAddAlignmentCallSiteAttr - same as above
LLVMRustAddDereferenceableCallSiteAttr - same as above
LLVMRustAddDereferenceableOrNullCallSiteAttr - same as above
LLVMRustAddFunctionAttribute - same as above
LLVMRustAddAlignmentAttr - same as above
LLVMRustAddDereferenceableAttr - same as above
LLVMRustAddDereferenceableOrNullAttr - same as above
LLVMRustAddFunctionAttrStringValue - same as above
LLVMRustRemoveFunctionAttributes - same as above
LLVMRustSetHasUnsafeAlgebra - not entirely sure what this is doing...
LLVMRustBuildAtomicLoad - I think at the time the C API didn't exist?
LLVMRustBuildAtomicStore - same as above
LLVMRustBuildAtomicCmpXchg - same as above
LLVMRustBuildAtomicFence - same as above
LLVMRustSetDebug - I think one-time configuration of LLVM
LLVMRustInlineAsm - I think the C API didn't exist (or wasn't full-featured enough)
LLVMRustAppendModuleInlineAsm - that function probably wasn't exposed in C
LLVMRustVersionMinor - just exposing a constant
LLVMRustVersionMajor - same as above
LLVMRustDebugMetadataVersion - this and most debug functions below I think just aren't in the C API
LLVMRustAddModuleFlag - same as above
LLVMRustMetadataAsValue - same as above
LLVMRustDI* - same as above (there's a whole bunch of these)
LLVMRustWriteValueToString - IIRC this is mostly debugging
LLVMRustLinkInExternalBitcode - used during normal LTO
LLVMRustLinkInParsedExternalBitcode - used during normal LTO
LLVMRustGetSectionName - not sure where this came from...
LLVMRustArrayType - missing C API?
LLVMRustWriteTwineToString - I think more debugging/diagnostics
LLVMRustUnpackOptimizationDiagnostic - diagnostics
LLVMRustUnpackInlineAsmDiagnostic - diagnostics
LLVMRustWriteDiagnosticInfoToString - diagnostics
LLVMRustGetDiagInfoKind - custom for us I think?
LLVMRustGetTypeKind - missing C API?
LLVMRustWriteDebugLocToString - debugging API I think
LLVMRustSetInlineAsmDiagnosticHandler - dealing with inline asm diagnostics
LLVMRustWriteSMDiagnosticToString - diagnostics
LLVMRustBuildLandingPad - missing C API?
LLVMRustBuildCleanupPad - same as above
LLVMRustBuildCleanupRet - same as above
LLVMRustBuildCatchPad - same as above
LLVMRustBuildCatchRet - same as above
LLVMRustBuildCatchSwitch - same as above
LLVMRustAddHandler - same as above
LLVMRustBuildOperandBundleDef - same as above
LLVMRustBuildCall - same as above
LLVMRustBuildInvoke - same as above
LLVMRustPositionBuilderAtStart - same as above I think?
LLVMRustSetComdat - same as above
LLVMRustUnsetComdat - same as above
LLVMRustGetLinkage - same as above
LLVMRustSetLinkage - same as above
LLVMRustConstInt128Get - same as above
LLVMRustGetValueContext - same as above
LLVMRustGetVisibility - same as above
LLVMRustSetVisibility - same as above
LLVMRustModuleBufferCreate - serializing a module to memory
LLVMRustModuleBufferFree - freeing above
LLVMRustModuleBufferPtr - reading above
LLVMRustModuleBufferLen - reading above
LLVMRustModuleCost - mostly a debugging helper
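The set/get last-error pair in this list is a common way to carry an error message across a C boundary, and a minimal sketch of the idea follows. It is an assumption-laden illustration rather than the actual RustWrapper.cpp code: the `Demo*` names are hypothetical, a thread-local slot is assumed, and `DemoParsePositive` exists only to show a fallible shim reporting through that slot.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// One pending error message per thread (assumed design for this sketch).
static thread_local char *DemoLastError = nullptr;

// Called from the C++ side: replaces any pending message with a copy of Err.
extern "C" void DemoSetLastError(const char *Err) {
  assert(Err);
  std::free(DemoLastError);
  std::size_t N = std::strlen(Err) + 1;
  DemoLastError = static_cast<char *>(std::malloc(N));
  if (DemoLastError)
    std::memcpy(DemoLastError, Err, N);
}

// Called from the consumer side: takes ownership of the pending message and
// clears the slot. Returns nullptr if no error is pending. The caller must
// release the returned string with free().
extern "C" char *DemoGetLastError(void) {
  char *Ret = DemoLastError;
  DemoLastError = nullptr;
  return Ret;
}

// Hypothetical fallible shim: parses a positive decimal integer, reporting
// failure through the last-error slot instead of a C++ exception.
extern "C" bool DemoParsePositive(const char *Text, long *Out) {
  assert(Text && Out);
  char *End = nullptr;
  long V = std::strtol(Text, &End, 10);
  if (End == Text || *End != '\0' || V <= 0) {
    DemoSetLastError("expected a positive integer");
    return false;
  }
  *Out = V;
  return true;
}
```

The caller checks the boolean result first and only then fetches the message, so the success path never touches the error slot.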