Open SomethingAwry opened 7 years ago
SQL Server long, long ago (~2003) decided to implement a (I believe crazy) set of additional checks on .NET IL above and beyond what is done by peverify.exe. I presume the people who decided to do this aren't even at Microsoft any more, and the decision has been lost in history. I've never seen these additional checks documented anywhere, nor any explanation of what soundness result is being established by the checks, and I don't know how to put it under automated test. This makes it really, really hard for other languages and code-generating tools to make sure that code can be run in SQL Server.
We did try pretty hard to generate SQLServer-friendly constructs in 2005-2007 back when ".NET code running in SQL Server" was the big new thing. It is still a useful scenario, and if we had PRs with some user validation that we were following the rules - and some documentation of the rules with automated checking in the F# test suite - then we might be able to accept it, if it doesn't change codegen too much.
They have an SQLCLR that is - as far as I know - implemented on top of SQLOS, in fact the plan is to support SQLCLR on Linux (please see the comments): https://blogs.technet.microsoft.com/dataplatforminsider/2016/12/16/sql-server-on-linux-how-introduction/
We are currently planning to support existing SQLCLR functionality for SAFE assemblies only (which works in the current CTP). This has no relation to .NET Core.
Pleae note, the newer version will have even more restriction in place, eg.: the 2017 version will only allow SAFE assemblies.
My guess is that they implemented something like Joe-E which is a subset of the Java programming language intended to support programming according to object-capability discipline for SQLCLR:
https://en.wikipedia.org/wiki/Joe-E https://people.eecs.berkeley.edu/~daw/papers/joe-e-ndss10.pdf
_Classes may not have mutable static fields_, because these create global state.
Catching out-of-memory exceptions is prohibited, because doing so allows non-deterministic execution. For the same reason, finally clauses are not allowed.
_Methods in the standard library may be blocked_ if they are deemed unsafe according to taming rules. For example, the constructor new File(filename) is blocked because it allows unrestricted access to the filesystem.
There are lot's of restriction what's available in SQLCLR:
https://docs.microsoft.com/en-us/sql/relational-databases/clr-integration/database-objects/clr-integration-programming-model-restrictions https://docs.microsoft.com/en-us/sql/relational-databases/clr-integration-security-host-protection-attributes/host-protection-attributes-and-clr-integration-programming
http://www.sqlservercentral.com/articles/Stairway+Series/104406/ http://www.sqlservercentral.com/articles/Stairway+Series/109905/
@zpodlovics Thanks for the links and the context!
I still find it kind of extraordinary that there are these adhoc extra limitations for this one particular execution environment, and that there isn't a set of tools to help library writers and compiler implementers stick within those limitations. I mean, it's not like SQL is the only environment that wants to host CLR code, nor that these rules somehow create a sound sandbox environment. Anyway...
The main problem is that FSharp.Core uses some of these constructs, e.g. non-readonly static fields to store amortized values such as the empty list, or the empty map and so on. It might be that we can adjust the compilation of FSHarp.Core.
Anyway, if you or someone else can get the compilation of FSharp.Core to the state where its compiled form passes these tests, and we can keep it that way through automated testing, then I think we would accept PRs in this area.
Hi there, @dsyme (and others). Just to clarify a few things here:
I don't think that these restrictions are "ad hoc" or arbitrary. The goal is to ensure that stability of the SQL Server process as much as possible. To that end:
Environment.UserInteractive == false
). So no need for Dialogs, Forms, graphics, fonts, etc.AUTHORIZATION
database principal). Hence, ALL callers of a particular method share the single App Domain that the Assembly (and Class) have been loaded into. This makes it far too easy to have race conditions. For example: SQLCLR assembly throws error when multiple queries run simultaneouslyFSharp.Core uses some of these constructs, e.g. non-readonly static fields to store amortized values such as the empty list, or the empty map and so on.
I am only barely familiar with F# so I apologize if this is a silly question, but if these objects are empty, do they need to be writable? Is it that something about them (structure, etc) changes? I believe readonly
static variables can be set in initialization and via the class constructor. Is this something that seems viable, or am I way off here?
If these non-readonly static fields a) need to be writable, and b) are static
for performance reasons, is it possible to forgo the performance gain and make them non-static fields, or something along those lines? Meaning, is it possible to accept that performance, at least for some scenarios, in this particular CLR host is slightly worse than other CLR hosts? If so, can compiler directives and/or other mechanisms be used to create a separate build for SQL Server / SQLCLR?
Even if either the static fields can be marked as readonly
or a separate build can be achieved with a modified approach to these fields, it is still possible to not meet the goal of having a SAFE
Assembly IF FSharp.Core references .NET Framework libraries that are not on the "supported" list.
Contrary to what @zpodlovics noted 2 comments above, SQL Server 2017 is not restricted to SAFE
Assemblies only, at least not across all platforms. SQL Server on Windows still fully supports EXTERNAL_ACCESS
and UNSAFE
Assemblies. It is only SQL Server 2017 (and newer, presumably) on Linux, as well as SQL Server via Amazon RDS, that supports SQLCLR only for SAFE
Assemblies. (At this time, Azure SQL Database still does not support SQLCLR in any capacity).
The security change with SQL Server 2017, even on Windows, is that now the requirements for loading an UNSAFE
Assembly are enforced for both SAFE
and EXTERNAL_ACCESS
Assemblies. Meaning, you need to strong-name, or sign with a Certificate, each Assembly, and first load that strong name key (Asymmetric Key in SQL Server) or Certificate into master, create a Login from that Asymmetric Key / Certificate, and grant that Login the UNSAFE ASSEMBLY
permission. This is required even on Linux where EXTERNAL_ACCESS
and UNSAFE
Assemblies aren't even supported. For more details of this mess, please see: SQLCLR vs. SQL Server 2017, Part 1: “CLR strict security” – The Problem
As far as automated testing goes, I'm not sure if it's possible to use Reflection to walk through all referenced .NET Framework classes and methods to see if a Host Protection Attribute (HPA) has been defined, but that might be something to try. Of course, that won't catch usage of non-readonly static variables, but should catch stuff that PEVERIFY.EXE wouldn't. Of course, you could also install a copy of SQL Server Express LocalDB on the build server that can spin up automatically when an attempt is made to connect to it. The CI process could simply attempt to execute the CREATE ASSEMBLY
statement and report back any errors if encountered. LocalDB will shutdown a few minutes after the connection is closed.
Hope this info helps :-). Take care, Solomon..
@srutzky Thanks for the info, it's the first explanation I've seen of the SQL checks since 2003.
The restrictions feel adhoc since they were invented by SQL server in 2004-2005 without, AFAIK, documenting it until now. Good to have the docs.
IF FSharp.Core references .NET Framework libraries that are not on the "supported" list.
FSharp.Core doesn't reference anything that's not on that list. It basically references nothing but System.Core
, System
and mscorlib
.
There is one true global mutable I think, accessed via Async.DefaultCancellationToken.
AFAIK FSharp.Core is about the safest component around, it's just a functional programming library. Perhaps SQL Server could add it to its safe list. It's Microsoft-signed so the strong name should be enough.
I believe readonly static variables can be set in initialization and via the class constructor. Is this something that seems viable, or am I way off here?
The way F# initializes values is slightly different to C#. Initializers for F# are run for an entire file, not class-by-class, so all static data declared in a file is initialized at once. A static constructor is used under the hood but it may initialize fields from multiple classes.
This is entirely sound and valid according to ECMA 335, but means we can't mark fields as "readonly" and still be verifiable. The fields are, however, private. From the perspective of F# the added checks seem pointless, since the library is safe.
FWIW as a compiler writer and language developer it's extremely hard to predict from ECMA 335 that some tool will impose the restriction that "static data must all be marked readonly, even private data". It's basically impossible to shape a language/library design and implementation if tools are free to impose extra adhoc rules like this...
@dsyme Sorry for the delay in responding, just been so incredibly busy... I still think it would be awesome to get FSharp.Core working in SQL Server / SQLCLR as a SAFE
assembly (if possible), so I will do what I can to assist in this effort..
@srutzky Thanks for the info, it's the first explanation I've seen of the SQL checks since 2003.
Yer quite welcome 😺
The restrictions feel adhoc since they were invented by SQL server in 2004-2005 without, AFAIK, documenting it until now. Good to have the docs.
I think the documentation was actually there, at least from the release of SQL Server 2005, but it was likely not well advertised and not easy to find unless you knew to look for it there.
FSharp.Core doesn't reference anything that's not on that list. It basically references nothing but
System.Core
,System
andmscorlib
.
This definitely helps.
There is one true global mutable I think, accessed via Async.DefaultCancellationToken.
I'm not sure what, if any, impact this has. I think we will find out if / when we get passed the static variable issue.
AFAIK FSharp.Core is about the safest component around, it's just a functional programming library. Perhaps SQL Server could add it to its safe list. It's Microsoft-signed so the strong name should be enough.
It's probably not as simple as that to get a library added to the "supported" list. The libraries on that list are guaranteed:
Initializers for F# are run for an entire file, not class-by-class, so all static data declared in a file is initialized at once. A static constructor is used under the hood but it may initialize fields from multiple classes.
I could be misunderstanding something here, but this does not sound like a conflict. Currently SQL Server appears to be initializing static class variables when each class is first referenced (I just tested this in SQL Server 2017 CU14). However, I don't see the practical harm in FSharp.Code initializing static class variables for all classes at the same time.
This is entirely sound and valid according to ECMA 335, but means we can't mark fields as "readonly" and still be verifiable. The fields are, however, private. From the perspective of F# the added checks seem pointless, since the library is safe.
I don't understand the "we can't mark fields as readonly and still be verifiable" part, but don't know that I need to either 😉 . However, the fields being private is irrelevant to the "readonly
" requirement. This requirement exists because the App Domain is shared across all Sessions (i.e. SPIDs) in SQL Server. App Domains are per Database, per Owner (i.e. the AUTHORIZATION
database principal). Hence, ALL callers of a particular method share the single App Domain that the Assembly (and Class) have been loaded into. This makes it far too easy to have race conditions. For example: SQLCLR assembly throws error when multiple queries run simultaneously.
That being said, if you simply cannot mark the static class variables as "readonly
", then we need to find another way. Do the values of those static class variables ever change after being initialized? If not, then there is a solution that would probably work (though some might consider it sketchy): decorate the static class variables with the CompilerGeneratedAttribute (the "Remarks" section even mentions SQL Server).
That being said, if you simply cannot mark the static class variables as "readonly", then we need to find another way. Do the values of those static class variables ever change after being initialized?
They are never changed after initialization, with the exception of Async.DefaultCancellationToken
If not, then there is a solution that would probably work (though some might consider it sketchy): decorate the static class variables with the CompilerGeneratedAttribute (the "Remarks" section even mentions SQL Server).
So it does! Yes, I think that may work
Do the values of those static class variables ever change after being initialized?
They are never changed after initialization, with the exception of
Async.DefaultCancellationToken
decorate the static class variables with the CompilerGeneratedAttribute (the "Remarks" section even mentions SQL Server).
So it does! Yes, I think that may work
Awesome! So we are at least a little closer 😄
Two questions:
You mentioned that Async.DefaultCancellationToken
does (or at least can) change after initialization. Does this create the potential for a race condition given the shared nature of the App Domain, hence shared nature of that static variable? Or, is the value stored in it valid for any session reading it?
Looking ahead at deployment considerations (trying to make it as painless as possible), you mentioned that the library is Microsoft-signed. Does this include signing with a certificate? And if yes, is it the same certificate used to sign the .NET Framework 4.x libraries? (there will be follow-up questions if certificates are not being used).
You mentioned that Async.DefaultCancellationToken does (or at least can) change after initialization. Does this create the potential for a race condition given the shared nature of the App Domain, hence shared nature of that static variable? Or, is the value stored in it valid for any session reading it?
Yes it does - see here, it is not protected via lock. We could protect it
Looking ahead at deployment considerations (trying to make it as painless as possible), you mentioned that the library is Microsoft-signed. Does this include signing with a certificate? And if yes, is it the same certificate used to sign the .NET Framework 4.x libraries? (there will be follow-up questions if certificates are not being used).
@KevinRansom @brettfo Can you confirm?
thanks!
I don't know how the dotnet/arcade SDK handles signing; I only know that it uses the Microsoft
key.
@tmat might know about signing/certs, though.
Yes, our assemblies are signed with SHA2 AuthentiCode certificate.
@tmat Thanks for that info! It actually reminded me that I can verify this myself: since FSharp.Core is available via NuGet, I just need to add it to a project and check FSharp.Core.dll's properties. So that's what I did. It appears to be signed with the same Certificate(s) used for the .NET Framework v4.x libraries. And that's very good news as it makes the deployment script much, much easier.
In fact, I have already tested this by loading the MS certificate (the SHA1 digest cert with thumbprint = 9DC17888B5CFAD98B3CB35C1994E96227F061675
), and then I was able to create the FSharp.Core assembly in SQL Server 2019, with CLR strict security
enabled (ideal), and in a database with TRUSTWORTHY OFF
(ideal). Of course, the assembly was marked as UNSAFE
, but the result of what we are doing here will be that it loads as SAFE
.
Assuming that:
CompilerGenerated
attribute can be added to static variables that are not marked readonly
,Async.DefaultCancellationToken
is addressed, andthen I can take care of the deployment aspect of this, and will assist on the automated testing aspect. At this point I am not sure how to proceed, but here is a general overview of what will be involved (as far as I can tell, of course):
Ultimately we need a completely self-contained deployment script that can be added to a project as a pre-release step, or handed to a DBA to execute once wherever needed, etc. Providing only the DLL and instructions on how to deploy it will only cause stress (trying to reinvent this wheel), lead to bad security practices (trying to get this done the quickest way possible), or giving up on F# for SQLCLR.
0x0D4A....
), but allows for setting a max line length, at which point it adds a \
for T-SQL line-continuation and proceeds on the next line. Setting a max line length makes the deployment script more readable and more manageable, at least in SSMS where it doesn't like super-long lines. We don't need to use BinaryFormatter if we are using something more programmatic like T-4 templates, but for a CMD script we will. I haven't tried PowerShell but I suppose that's an option as well.[master]
if not already there (using FROM BINARY = 0x....
)UNSAFE ASSEMBLY
permissionFROM 0x....
), else load updated assembly using ALTER ASSEMBLY
. This should prevent needing to removing dependent objects (other assemblies that reference this one, and T-SQL wrapper objects pointing to those assemblies)I have used this pattern before, so this script is already partially complete.
I'm not sure if you want to include the final deployment script in the automated testing. If you do, here is what's needed:
'CLR strict security'
enabled (if SQL Server 2017 or newer), and TRUSTWORTHY OFF
for the test DB. Without ensuring these two settings, the security aspect of the test (the part requiring the certificate and its login) will not truly be tested.At this point, as long as the FSharp.Core assembly has been created, then success! At least to a degree. The goal is to get FSharp.Core loaded as a SAFE
assembly as it will help increase adoption given that folks tend to shy away from UNSAFE
assemblies, and platforms such as Azure SQL Database Managed Instances only allow SAFE
assemblies. But, loading it as SAFE
doesn't mean that it can execute as SAFE
since the verification done at load-time (whether from disk / binary or upon loading into the App Domain) can't catch everything. So, to go one step further in testing, we could include a simple SQLCLR assembly with one or two simple functions and/or stored procedures, and have an additional test step for loading that assembly, creating the T-SQL wrapper objects, and executing them (all via SQLCMD.EXE).
Sorry for the lengthy post, but this stuff needed to be documented somewhere to get a sense of scope. If there is a more appropriate place to put the above info, just let me know.
At this point, the only major hurdle (that I can see) is resolving the concurrency issue related to Async.DefaultCancellationToken
. Of course, something to consider is: if this static variable is only used when someone explicitly codes async functionality (as opposed to it being used internally even if the user assembly doesn't attempt anything async), then we have the option of simply stating that Phase 1 of getting this to work "seamlessly" in SQLCLR does not support async functionality. And then we solve that issue later. In this scenario, we only need the static variables decorated with CompilerGenerated
. Thoughts?
@dsyme @vzarytovskii @srutzky
Is there something specific me and the boys, e.g., @august-alm, can help out with to get this mess sorted out? Since the issue has been moving in and out of F# Compiler and Tooling there might be more to the story than I’m made aware of from this thread, any pointers?
My company works on a boutique compiler for the CLR and we also have the main parts of our core banking system at current being dependent on both SQLCLR and F#, so I we have the incentives sort this out smoothly.
I want to use F# to create CLR functions for SQL Server but I cannot publish them automatically because FSharp.Core is not "safe" as far as SQL Server is concerned.
Specifically, I get the error message:
Of course, it may be that some other resolution to 'empty' is needed other than the one suggested above, but it's the first message I've seen stating why FSharp.Core isn't safe, but I don't have the chops to address it without the worry of mucking up something else.
Details
A Visual Studio (2017) solution with two projects: 1) an F# library, 2) a SQL Server Database Project.
A simple library could be:
The database project can be empty, just add a reference the library. Building/Publishing the DB project will do the work of creating the deploy SQL for an assembly binary and making the function(s) available in T-SQL.
Try to publish the DB project to a SQL Server; fails with the above message.
Contrariwise, the same DB project with a C# library auto-published works fine.
Note
Strictly speaking, the 'direct translation' of a working C# library would look like this:
but doesn't make a difference.