CoatiSoftware / SourcetrailDB

Library to export Sourcetrail compatible database files for writing custom indexers
Apache License 2.0

Support for Swift #13

Open BrunoMiguens opened 4 years ago

BrunoMiguens commented 4 years ago

Would be awesome to have support for Swift.

I'm available to help you out!

mlangkabel commented 4 years ago

I agree, this would be really great! The first thing to do would be to find a framework that can extract the relevant information from Swift code, probably some kind of compiler or language server. If running it on some Swift code yields the desired outcome (finding symbol definitions and the relations between those symbols, such as calls and usages), the next step would be to extend SourcetrailDB with bindings to whatever language that framework is written in.

You can find a short step-by-step description of this process in our Language Extension Guide.
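To make the goal concrete, here is a minimal Python sketch of the kind of information such a framework would have to deliver: symbol definitions plus the relations between them. The names `Symbol`, `Reference` and `SymbolGraph` are hypothetical and are not part of the SourcetrailDB API; a real indexer would hand equivalent data to the SourcetrailDB bindings instead of keeping it in memory.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Symbol:
    name: str  # qualified name, e.g. "Greeter.greet"
    kind: str  # e.g. "class", "method", "function"


@dataclass(frozen=True)
class Reference:
    source: str  # name of the symbol the relation starts from
    target: str  # name of the symbol being referenced
    kind: str    # e.g. "call", "usage"


@dataclass
class SymbolGraph:
    """Hypothetical in-memory store for definitions and relations."""
    symbols: dict = field(default_factory=dict)
    references: list = field(default_factory=list)

    def add_symbol(self, name, kind):
        self.symbols[name] = Symbol(name, kind)

    def add_reference(self, source, target, kind):
        # Only relate symbols whose definitions were recorded first.
        if source not in self.symbols or target not in self.symbols:
            raise KeyError("both symbols must be defined before they are related")
        self.references.append(Reference(source, target, kind))

    def callers_of(self, target):
        return sorted(r.source for r in self.references
                      if r.target == target and r.kind == "call")


# Made-up example data standing in for what an extractor would report.
graph = SymbolGraph()
graph.add_symbol("Greeter", "class")
graph.add_symbol("Greeter.greet", "method")
graph.add_symbol("main", "function")
graph.add_reference("main", "Greeter.greet", "call")
graph.add_reference("main", "Greeter", "usage")
```

If a candidate framework cannot fill in both the symbols and the references for a small Swift test file, it is probably not a good basis for an indexer.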

If you or anyone else has some experience with Swift and knows of or can find a framework to use, please let us know!

BrunoMiguens commented 4 years ago

I do have experience with Swift, so I'm ready to help. I just need to look into the compiler or language server part of the process, so any help is welcome.

Swift has a way to access the AST by using the command line or the library SwiftSyntax:

xcrun swiftc -frontend -emit-syntax ./File.swift

The result is JSON, which may or may not be a problem.
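JSON output is at least easy to post-process. Below is a minimal Python sketch, assuming each node in the emitted tree is a JSON object with a string `"kind"` field and nested children. That schema is an assumption and the sample tree is hand-written, not real compiler output. The command is only assembled here, not executed.

```python
import json
from collections import Counter


def build_emit_syntax_command(path):
    # Mirrors the command quoted above; it is built but not run here.
    return ["xcrun", "swiftc", "-frontend", "-emit-syntax", path]


def count_kinds(node, counts=None):
    """Recursively count the string values stored under "kind" keys."""
    if counts is None:
        counts = Counter()
    if isinstance(node, dict):
        kind = node.get("kind")
        if isinstance(kind, str):
            counts[kind] += 1
        for value in node.values():
            count_kinds(value, counts)
    elif isinstance(node, list):
        for item in node:
            count_kinds(item, counts)
    return counts


# Hand-written sample (assumed shape), NOT real swiftc output.
sample = json.loads("""
{"kind": "SourceFile",
 "layout": [
   {"kind": "StructDecl", "layout": []},
   {"kind": "FunctionDecl", "layout": [{"kind": "CodeBlock", "layout": []}]},
   null
 ]}
""")
kinds = count_kinds(sample)
```

Note that a pure syntax tree only describes what was written. It does not say which declaration a call resolves to, so it would cover symbol definitions more readily than relations.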

mlangkabel commented 4 years ago

Swift is statically typed, so writing Swift language support for Sourcetrail with full coverage of the indexed code should be doable! Ok, to get started, please read the "First Things First" section of the language extension guide linked above to get an estimate on what we need to continue :)

BrunoMiguens commented 4 years ago

I've read the "First Things First" section and pretty much the rest of the doc, and to be honest I'm kind of lost... perhaps it's just me.

mlangkabel commented 4 years ago

What are your questions? Maybe I can help :)

LouisStAmour commented 4 years ago

Swift ... appears to be harder than it should be, likely due to how rapidly Swift was developed. It looks to me like SwiftSyntax is a nice interface for ... well ... syntax ... but if you want a control flow graph, that appears to require SIL format, where we're back in C++ territory. The good news is you won't need SWIG, the bad news is, it's C++ everywhere. ;-)

Anyway, https://github.com/apple/swift/blob/0dc8e06d39d7d2145916d72c4324bd3414de9580/include/swift/SIL/SILLocation.h#L32-L38 seems to have the ability to switch between a SIL CFG and a Swift ASTNode. But there are very, very few open source examples of doing this. The one example I can find in the official Swift repo is https://github.com/apple/swift/blob/e544d367ac0a55fcfc29927e9b2397e11bdb534d/lib/SIL/SILVerifier.cpp#L1070

A related effort appears to be https://github.com/themaplelab/swift/wiki/UCOSP-W2018---Starting-Email which led to https://github.com/themaplelab/swan/wiki/ARCHIVE and eventually https://github.com/themaplelab/swan

The magic appears to happen in https://github.com/themaplelab/swan/blob/7a3fe1b4b20615361d28430e331366f48592edc7/ca.maple.swan.translator/include/WALAInstance.h and https://github.com/themaplelab/swan/blob/7a3fe1b4b20615361d28430e331366f48592edc7/ca.maple.swan.translator/lib/WALAInstance.cpp#L43:20 and https://github.com/themaplelab/swan/blob/7a3fe1b4b20615361d28430e331366f48592edc7/ca.maple.swan.translator/lib/InstructionVisitor.cpp#L44:26 (that last file is not quite as pretty as I'd like, but the original SIL code looks even scarier, so I'll roll with it.)

Another version: https://github.com/Polidea/SiriusObfuscator-SymbolExtractorAndRenamer/blob/80649e995557d56ccb2c16a77b1e50fd7a1297d7/swift/tools/sil-func-extractor/SILFunctionExtractor.cpp#L278 and https://github.com/Polidea/SiriusObfuscator-SymbolExtractorAndRenamer/blob/80649e995557d56ccb2c16a77b1e50fd7a1297d7/swift/lib/Serialization/DeserializeSIL.h#L143 which uses an API which loads a serialized SIL output https://github.com/apple/swift/blob/master/lib/Serialization/SerializedSILLoader.cpp#L24 and well, deserializes it (also known as "parsing") https://github.com/apple/swift/blob/master/lib/Serialization/DeserializeSIL.h It doesn't look like deserialization automatically creates a CFG. In particular, it's unclear how a deserialized SIL could still maintain references or pointers to the original AST, though maybe that would be more evident in an example.

A blog post from the Polidea folks is at https://www.polidea.com/blog/how-to-build-swift-compiler-based-tool-the-step-by-step-guide/

Search for "CFG" here and you'll see that SIL appears to have been designed for diagnostics and control flow mapping back to source information: https://llvm.org/devmtg/2015-10/slides/GroffLattner-SILHighLevelIR.pdf

ZkHaider commented 3 years ago

Why is this issue closed? @BrunoMiguens

Swift has a Language Server Protocol implementation.

ZkHaider commented 3 years ago

https://github.com/apple/sourcekit-lsp

@BrunoMiguens

mlangkabel commented 3 years ago

@ZkHaider, right. Swift support has not been implemented, so we can keep this one open. There is a discussion on using the language server protocol on the Sourcetrail issue tracker.
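For anyone unfamiliar with the protocol: a language server exchanges JSON-RPC messages, each framed with a `Content-Length` header. Here is a minimal Python sketch of that framing with a standard `textDocument/documentSymbol` request. The file URI is a placeholder, the helper names are made up for illustration, and the `initialize` handshake that a server needs first is omitted.

```python
import json


def encode_lsp_message(payload):
    """Frame a JSON-RPC payload with the Content-Length header used by LSP."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def decode_lsp_message(data):
    """Parse one framed message and return its JSON payload."""
    header, _, rest = data.partition(b"\r\n\r\n")
    length = None
    for line in header.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    return json.loads(rest[:length].decode("utf-8"))


# Placeholder request; the URI does not point at a real file.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/documentSymbol",
    "params": {"textDocument": {"uri": "file:///tmp/File.swift"}},
}
framed = encode_lsp_message(request)
```

An indexer built this way would write `framed` to the server's standard input and decode the replies with the same framing.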

olbrichj commented 3 years ago

There has been no progress on Swift support or, as far as I can see, on adding language server protocol support. Has anyone looked into SourceKit? https://www.jpsim.com/uncovering-sourcekit/

There is a Swift implementation to communicate with it (it can also be run from the command line): https://github.com/jpsim/SourceKitten
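The command-line output is JSON, so it could be consumed from any language. Here is a minimal Python sketch, assuming the output of `sourcekitten structure --file File.swift` nests declarations under `key.substructure` with `key.name` and `key.kind` entries. The sample is hand-written rather than captured from a real run, and `collect_declarations` is a made-up helper.

```python
def collect_declarations(node, parent=None, out=None):
    """Walk a SourceKitten-style structure dict and list (name, kind) pairs."""
    if out is None:
        out = []
    name = node.get("key.name")
    kind = node.get("key.kind")
    qualified = parent
    if name and kind:
        # Prefix nested declarations with the enclosing declaration's name.
        qualified = f"{parent}.{name}" if parent else name
        out.append((qualified, kind))
    for child in node.get("key.substructure", []):
        collect_declarations(child, qualified, out)
    return out


# Hand-written sample in the assumed shape, NOT real SourceKitten output.
sample = {
    "key.substructure": [
        {
            "key.kind": "source.lang.swift.decl.class",
            "key.name": "Greeter",
            "key.substructure": [
                {
                    "key.kind": "source.lang.swift.decl.function.method.instance",
                    "key.name": "greet()",
                }
            ],
        }
    ]
}
declarations = collect_declarations(sample)
```

This only covers definitions. Relations such as calls would need additional requests or a filter on the expression kinds in the same output.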

mlangkabel commented 3 years ago

Looks like SourceKit would be suitable for doing the heavy lifting for a Sourcetrail indexer. But that would only work on macOS, right?

olbrichj commented 3 years ago

SourceKitten:

On Linux, SourceKit is expected to be located in /usr/lib/libsourcekitdInProc.so or specified by the LINUX_SOURCEKIT_LIB_PATH environment variable.

So the only system I don't know about is Windows.
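A minimal Python sketch of that lookup rule on Linux, for illustration. The quote does not say whether `LINUX_SOURCEKIT_LIB_PATH` holds a directory or the full path to the library, so this sketch accepts either; that behavior and the function name are assumptions, not SourceKitten's actual implementation.

```python
import os
import posixpath

DEFAULT_SOURCEKIT_LIB = "/usr/lib/libsourcekitdInProc.so"
LIB_NAME = "libsourcekitdInProc.so"


def resolve_sourcekit_lib(env=None):
    """Return the path where the SourceKit library is expected on Linux."""
    env = os.environ if env is None else env
    override = env.get("LINUX_SOURCEKIT_LIB_PATH")
    if not override:
        return DEFAULT_SOURCEKIT_LIB
    # Assumption: treat a value ending in ".so" as the library file itself,
    # and anything else as the directory that contains it.
    if override.endswith(".so"):
        return override
    return posixpath.join(override, LIB_NAME)
```

For example, `resolve_sourcekit_lib({})` falls back to the default location, while passing a toolchain directory in the environment mapping points the lookup there.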

One question I have is about interoperability between languages. Do we have to write an entire indexer for Swift -> Obj-C -> Obj-C++ -> C++ (where I guess that Obj-C and C++ will cover the Obj-C++ part)?