Closed: forki closed this issue 6 years ago.
Not currently. There was talk of serializing the type signatures to a file location, but I don't think any work has started on it. It is definitely something that is possible, though.
Possible yes, the core of it is pretty easy to implement but it has a larger impact across everything else and will have some dragons that need taming.
It is one of the 3 bigger ticket items that we would like to do should we ever get the time :)
@pezipink so this reads like you accept the challenge of dragon taming ;-)
Just kidding, it's not that important. I will find a way to `#if` it out.
This would indeed be extremely useful. Currently, the type provider seems to hit the database on every single file save, completely killing intellisense until it succeeds or fails (which is not too uncommon for some reason).
I've started to work on it today; there is something already, and it has been surprisingly easy so far. The branch name is "serialize" in my fork. If you guys could check it out, that would be great! :)
There are currently in-memory caches for generated tables and columns. Some notes:
A few more things off the top of my head to think about with respect to offline-caching:
A main selling point of the SQLProvider is that it accesses the schema lazily. Thus, we'd only want to cache the types that are actually used.
Partially cached types would probably need to be extended during "online" development. Do we also check for deltas, or simply have the user delete the cache in order to receive updated schema?
How much control do we give to the user for caching? Should it be fully controlled by a static parameter, or automatically attempt to cache when "online", only using types when a connection can't be made?
Various threading and IDE issues
Versioning? Perhaps this isn't that important since the user can delete their cache, but we wouldn't want a new version of the SQLProvider to suddenly break their compile due to an old cache.
It would be nice to avoid any library dependencies, since we have managed this far to have none except the provided types API!
Yes, the real world, scary! :/ Some answers I would propose:
The real problem IMHO is the threading; that already got me yesterday, and I will try to deal with it today. But in summary, I propose that this be, at least at first, an extra feature for those who deal with a very big and stable schema, for whom the constant refresh of a schema that does not change is itself a problem. And of course that would be an initial version that could be enhanced over time. What do you think?
Anyhow, thanks guys for the tips, and I will start to deal with these concerns. :D
Just a thought on taking dependencies: we could try to internalise the dependency using ILRepack as part of the build. However, I have no idea how this behaves with type providers; it would be interesting to find out.
I think the whole universe is depending on James Newton-King. We can load the DLLs dynamically and call them with reflection to pretend that we are still in our own universe, but all the other NuGet packages load that into your server memory anyway.
Yes, I wasn't saying we should never introduce dependencies; if they are needed and it fits, then it fits...
Just an idea, one way to do this could be:
ctx.Offline(@"c:\sqlprovider.cache").SaveNow.``Save failed: Not enough disk space``
That could be then extended with another save method that would save the context in a background thread every 3 minutes if the data connection is available.
Interesting idea, very much non-standard! Would be cool to try as an experiment.
Another, maybe even simpler, option than Newtonsoft would be using System.Data.DataSet, as we currently can't live without System.Data anyway. Something like:
// pseudo
let saveSchema (location:string) =
    let ds = new DataSet()
    // Seq.iter forces the side effects; Seq.map is lazy and would never run them
    tableLookup |> Seq.iter(fun t -> ds.Tables.Add(t.Value.Name, t.Value.Schema) |> ignore)
    columnLookup |> Seq.iter(fun c -> ds.Tables.[0].Columns.Add(c.Value.[""].Name) |> ignore)
    pkLookup |> Seq.iter(fun pk ->
        pk.Value |> List.iter(fun key -> ds.Tables.[pk.Key].Columns.[key].Unique <- true))
    relationshipLookup |> Seq.iter(fun c -> (* ... *) ())
    ds.WriteXmlSchema(location)

let loadSchema (location:string) =
    let ds = new DataSet()
    ds.ReadXmlSchema(location)
    ds.Tables |> Seq.cast<DataTable> |> Seq.iter(fun t -> tableLookup.AddOrUpdate( (* ... *) ) |> ignore)
    // ...
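To make the DataSet idea concrete, here is a standalone round-trip sketch. The table and column names ("Customers", "Id", "Name") are invented for illustration and are not SQLProvider identifiers:

```fsharp
open System.Data

// Write a table/column layout out as an XML schema file and read it back,
// mimicking the saveSchema/loadSchema pseudocode above.
let path = System.IO.Path.Combine(System.IO.Path.GetTempPath(), "schema-sketch.xsd")

let ds = new DataSet("Cache")
let customers = ds.Tables.Add("Customers")          // invented table name
customers.Columns.Add("Id", typeof<int>) |> ignore  // invented columns
customers.Columns.Add("Name", typeof<string>) |> ignore
customers.PrimaryKey <- [| customers.Columns.["Id"] |]

ds.WriteXmlSchema(path)

// A fresh DataSet restores the structure (tables, columns, keys) from the file.
let ds2 = new DataSet()
ds2.ReadXmlSchema(path)
printfn "%s: %d columns, pk = %s"
    ds2.Tables.[0].TableName
    ds2.Tables.[0].Columns.Count
    ds2.Tables.[0].PrimaryKey.[0].ColumnName
```

Note that `DataTable.Columns.Add` returns the new `DataColumn`, hence the `|> ignore`.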
Edit: I was thinking that this could help with either unit testing or local caching, as the DataSet could be filled with data as well... But maybe it would just make things unnecessarily complex: 1) for the cache to work correctly, all the key-value-pair information has to be saved, 2) even if some dummy-data-insertion method were exposed, there would be no easy way to check whether the data is live and correct, 3) I expect we don't need an in-memory SQLProvider.
After our conversation yesterday at F# Exchange I am reviewing the code to figure out how we can add this feature. Please check what I have in mind and correct me if some of my assumptions are not correct.
We can use DataSet capabilities to save and load data schema as @Thorium suggested so there's no need to bring additional dependencies.
An additional static SqlDataProvider parameter SchemaCache can be introduced to point to file(s) with data schema(s); more about it in p. 5.
If SchemaCache is not set or points to a non-existing file, SqlDataProvider will behave as today, resolving the schema from the actual database. This will make it easy to configure cache once and enable it only when needed by placing the cache file into the specified location.
There will be a new SqlDataProvider method SaveSchema (we may also add LoadSchema for symmetry, but I am not sure about the scenario where it would be used).
Now, what to save? I think saving the whole schema (actually, schemas) can be overkill. I can only imagine how many megabytes it would take to save all schemas for the Oracle database in our organization, and ours is far from the biggest. Here is what can be done:
It will be up to a developer to call SaveSchema. This call should probably not be part of code running in production; something that can be executed from a script, perhaps.
Once the schema(s) are saved, they will be retrieved whenever SchemaCache is set, so there will be no need to load them explicitly in client code using LoadSchema.
Example: SchemaCache = [{SchemaName = "main"; FilePath = @"C:\Temp\main.xml"}] For databases without support for multiple schemas (if any): [{SchemaName = ""; FilePath= @"C:\Temp\schema.xml"}]
NB! Having written these notes, I've found the Owner parameter that can, depending on the database vendor, specify the schema or database name. If this is sufficient to resolve the schema name, we may not even need a record type SchemaCache and can simply specify the file path of the schema for the respective owner. This makes better reuse of existing configuration settings but leaves the question of how to save/load multiple schemas. IMHO it may be OK to restrict caching to a single schema, so SchemaCache will require Owner to be set for multi-schema databases, and if a project uses schemas from multiple owners it will need to define multiple context types. Something to discuss, but if we make such a restriction, then the SchemaCache parameter can be renamed to SchemaPath (string) and SaveSchema will take a single string parameter specifying the file path. The rest will be figured out from the Owner and TableNames values.
What are your views on saving the whole database schema vs. saving only the cached concurrent dictionaries' content? The cache dictionaries should have all the fields needed for a successful build, but they are currently provider-specific, so they could be saved via either a common utility function or a new interface member.
Saving the concurrent dictionaries will be more efficient than using the legacy DataSet (and who loves XML?). The only argument in favor of the DataSet schema is that it's already there and doesn't require new dependencies. And it should be OK to deal with provider-specific dictionaries. But what would the serialization format be? JSON? Bring in a JSON.NET dependency just because of this feature?
P.S. SQLProvider already references System.Runtime.Serialization, which contains DataContractSerializer. Can this be used instead of adding Newtonsoft.Json?
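It can, at least for data-only shapes. As a rough sketch (a plain dictionary stands in for the real lookup dictionaries, and the keys are invented), DataContractJsonSerializer round-trips without any new package reference:

```fsharp
open System.Collections.Generic
open System.Runtime.Serialization.Json

// Serialize a dictionary to a JSON string and back using only
// System.Runtime.Serialization - no Newtonsoft.Json required.
let toJson (data: Dictionary<string, string[]>) =
    use ms = new System.IO.MemoryStream()
    let ser = DataContractJsonSerializer(typeof<Dictionary<string, string[]>>)
    ser.WriteObject(ms, data)
    System.Text.Encoding.UTF8.GetString(ms.ToArray())

let ofJson (json: string) =
    use ms = new System.IO.MemoryStream(System.Text.Encoding.UTF8.GetBytes json)
    let ser = DataContractJsonSerializer(typeof<Dictionary<string, string[]>>)
    ser.ReadObject(ms) :?> Dictionary<string, string[]>

let pks = Dictionary<string, string[]>()
pks.["dbo.Customers"] <- [| "Id" |]   // invented key/value pair
let restored = pks |> toJson |> ofJson
```

Arrays are used here instead of F# lists because data-contract serializers handle plain BCL collections more gracefully than F#-specific types.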
And if we split the implementation into two parts:
then if we agree about 1, I can start working on it using whatever serialization, and we can finalize the schema format later. So if you have any input on part 1 (e.g. support for saving multiple schemas per context, naming of new parameters and methods, etc.), please let me know.
More on serializers with .NET Core support: https://github.com/dotnet/corefx/issues/6564
Please consider other serializers in .NET Core such as DataContractSerializer (DCS), DataContractJsonSerializer (DCJS) and XmlSerializer, which are and will continue to be supported in .NET Core. Of course, you can also consider 3rd party serializers built on .NET Core.
I think compile-time offline caching is not performance-critical, so probably any serialization will do. We can split the implementation into two parts.
Regardless of the serialization format, there are still two options:
Initially I would say that the first one sounds better, as some people are using huge Oracle schemas. I would prefer that we start with as simple a solution as possible and solve more complex scenarios when needed.
I also think the first sounds better to begin with. I will do some experiments in a new branch and let you review the results once I get something worthy.
@Thorium I forked the code and am experimenting with it right now, and I need some tips on how I can save only the part of the schema referenced by the current context, not the whole db schema. Here's what I did so far:
Defined SaveContextSchema method in SqlDesignTime.fs:
yield ProvidedMethod("SaveContextSchema",
                     [ProvidedParameter("filePath", typeof<string>)],
                     typeof<unit>,
                     invokeCode = fun args ->
                         <@@ ((%%args.[0] : obj) :?> ISqlDataContext).SaveContextSchema((%%args.[1] : string)) @@>) :> MemberInfo
Extended ISqlDataContext with a new method SaveContextSchema:
abstract SaveContextSchema : string -> unit
Added a prototype implementation of SaveContextSchema in SqlRuntime.DataContext. The implementation currently serializes tables using DataContractJsonSerializer, just to verify that this works:
member __.SaveContextSchema(filePath) =
    use con = provider.CreateConnection(connectionString)
    providerCache
    |> Seq.iter (fun prov ->
        let prov = prov.Value
        let tables = prov.GetTables(con, caseSensitivity)
        tables
        |> List.iter (fun table ->
            use ms = new System.IO.MemoryStream()
            let ser = System.Runtime.Serialization.Json.DataContractJsonSerializer(table.GetType())
            ser.WriteObject(ms, table)
            let json = ms.ToArray()
            System.IO.File.AppendAllText(filePath, System.Text.Encoding.UTF8.GetString(json, 0, json.Length))))
When I test the implementation I see that SaveContextSchema saves everything: all database tables. This is of course a problem, because just a JSON with all Oracle tables (no columns) takes about 0.5 MB in our case. It can easily become 100 times larger with all the column information. But I can't figure out how to obtain the schema for only the referenced tables/columns. Any tips?
Agreed, the primary use case for the caching is NOT to serialize the whole db schema, but only the stuff that has actually been used, so you can compile "offline" (as opposed to developing against "new" stuff offline). Whilst it could be useful sometimes - especially for small databases - the whole point of the type provider is to be able to access massive schemas without generating lots of anything in the process, be it code or data.
@pezipink So what is your take on this use case? Do you think it's acceptable to just cache everything (this looks to be a straightforward task), since we are talking about quite a special case - if anyone wants a schema offline, they might as well afford the generation of a multi-megabyte file? Or should we strive to find a way of caching the schema just for the tables used? In that case I need some hints from you guys; I can't figure out myself how to obtain such a schema.
I think you should save only the content of the provider's concurrent dictionaries, not tables. Each provider has those, e.g. MS SQL Server:
let pkLookup = ConcurrentDictionary<string,string list>()
let tableLookup = ConcurrentDictionary<string,Table>()
let columnLookup = ConcurrentDictionary<string,ColumnLookup>()
let relationshipLookup = ConcurrentDictionary<string,Relationship list * Relationship list>()
Are these populated well enough? By the time you run the code editor, the schemas of the already-used tables should be in those. So save and load only those, not the whole schema.
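A sketch of what "save only those" could look like (value types simplified to string[]; the real dictionaries hold Table, ColumnLookup, etc.): snapshot each ConcurrentDictionary into a plain array of pairs for the serializer, and rebuild it on load:

```fsharp
open System.Collections.Concurrent

// ToArray takes a consistent point-in-time copy of the dictionary,
// which is safer to hand to a serializer than enumerating it live.
let snapshot (d: ConcurrentDictionary<string, string[]>) =
    d.ToArray() |> Array.map (fun kv -> kv.Key, kv.Value)

// Rebuild the lookup from the saved pairs when running offline.
let restore (pairs: (string * string[])[]) =
    let d = ConcurrentDictionary<string, string[]>()
    for k, v in pairs do d.[k] <- v
    d

// Example with an invented entry:
let pkLookup = ConcurrentDictionary<string, string[]>()
pkLookup.["dbo.Orders"] <- [| "OrderId" |]
let reloaded = pkLookup |> snapshot |> restore
```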
Also, I think you shouldn't run it in invokeCode, as that is never invoked at design time, right? Rather, add the code to AddMembersDelayed or AddXmlDocDelayed, and return the save result as a new property (if you use AddMembersDelayed) or as a string (if you use AddXmlDocDelayed).
@Thorium But how do I get access to them from the common provider code, since all lookup dictionaries are defined inside their respective providers? Should each provider expose its lookup dictionaries?
I would vote for a common tooling method (or methods) that saves the data, called inside each provider implementation. There is probably not enough common API for them to be exposed nicely.
I think it is acceptable if that is what the user really wants, but it does go against the spirit of what I was initially trying to achieve, and brings it closer to what SQLMetal does.
I agree: add it to the interface and have the providers serialize their dictionaries. You will need to be careful though, because things like AddMembersDelayed get called a bunch of times on different threads and other crazy things; likely you will need some locking, and maybe to be clever about how you serialize so it isn't trying to re-serialize the whole thing all the time. I foresee some unforeseen adventures for you!
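The locking concern could be handled with a small guard around the save. A sketch (saveLock, lastSaved, and the doSave parameter are invented names, not SQLProvider code):

```fsharp
open System

let saveLock = obj ()
let mutable lastSaved = DateTime.MinValue

// Serialize the cache at most once per interval, under a lock, so that
// concurrent AddMembersDelayed callbacks don't re-save redundantly.
let saveIfStale (interval: TimeSpan) (doSave: unit -> unit) =
    lock saveLock (fun () ->
        if DateTime.UtcNow - lastSaved > interval then
            doSave ()
            lastSaved <- DateTime.UtcNow)
```

Every callback can call `saveIfStale`, but only the first one inside each interval actually pays the serialization cost.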
Also, it is great you are having a go at this piece of work. Thanks ! :)
Alright, so I will look into caching the provider-specific concurrent dictionaries and will use AddMembersDelayed. I will probably ask more questions along the way.
Question regarding differences in lookup dictionaries for different providers.
The majority of the providers use the following set of dictionaries:

let pkLookup = ConcurrentDictionary<string,string list>()
let tableLookup = ConcurrentDictionary<string,Table>()
let columnLookup = ConcurrentDictionary<string,ColumnLookup>()
let relationshipLookup = ConcurrentDictionary<string,Relationship list * Relationship list>()
However there are some exceptions:
If these differences are not intentional, I can make all providers use the same set. But since we plan to expose new common functionality based on the schema cache, should these collections be moved to a common type that combines them all? Then they could all share a common schema cache implementation. /cc @Thorium @pezipink
One thing to consider is that you may have multiple database connections, e.g. both MySQL and Postgres inside one program, or two connections of the same type to different databases. So now that the containers are not global but live inside a provider, the caches won't interfere with each other. As long as that works, the cache implementation can be common.
Of course container instances should be per provider/connection. I meant we could have a common type that wraps the schema caches, but each provider/connection will continue to create and manage instances of that common type.
Yes, that would be good. :-)
I have implemented caching of schema which I am currently testing in my branch (https://github.com/object/SQLProvider/tree/schema_cache), works with Oracle, will run some tests with MS SQL Server later.
This appeared to be quite a big change.
With all these changes I can now write the following code:
type sql = SqlDataProvider<
ConnectionString = "my connection string",
DatabaseVendor = Common.DatabaseProviderTypes.ORACLE,
ResolutionPath = ResolutionPath,
IndividualsAmount = 1000,
UseOptionTypes = true >
let dbContext = sql.GetDataContext()
let profiles =
query { for p in dbContext.MyDatabase.Profiles do
where (p.Media = "video")
select p.Description }
profiles |> Seq.toList
dbContext.SaveContextSchema(@"C:\TEMP\Schema.txt")
Then I can disconnect machine from the internet and replace the code with the following:
type sql = SqlDataProvider<
ConnectionString = "my connection string",
DatabaseVendor = Common.DatabaseProviderTypes.ORACLE,
ResolutionPath = ResolutionPath,
IndividualsAmount = 1000,
UseOptionTypes = true,
ContextSchemaPath = @"C:\TEMP\Schema.txt" >
let dbContext = sql.GetDataContext()
let profiles =
query { for p in dbContext.Granitt.Profiles do
where (p.Media = "video")
select p.Description }
profiles |> Seq.toList
And the code will compile.
Let me know what you think about this approach. It would be great to test it on databases other than Oracle and MS SQL Server.
@Thorium @pezipink
It's up to us to define behavior, but this is how it is currently implemented.
Reasoning behind this behavior: it would be nice to avoid unnecessary edits of the SqlDataProvider constructor parameters just to enable and disable offline mode. The construction parameters can be set once, and offline mode can be turned on based on the presence of the specified file. So it will be completely up to the development team when and where to place the schema cache file to control offline mode.
I have MSAccess, SQLServer, MySql, Odbc, Postgres and SQLite so I can do some testing when I'm not so busy.
I plan to do SQL Server testing on Linux tonight from MacBook/mono. This should provide some variation of environments :-)
Procedures are quite commonly used; GetSprocs returns a Sproc list, so that could maybe be added to support the sprocs, if the params (the IDbConnection -> QueryParameter list functions) serialize correctly. Individuals... probably no one uses them anyway, other than for exploring the data at design time.
If GetSprocs can be added as easily as tables and columns, I can extend SchemaCache to include them.
I thought that this would save the file at design time. Instead it saves the cache at runtime. OK, that's fine, it just needs one more step for the user to run the code... Is all the information in the cache at runtime?
I just tested with SQL Server on Linux and SQL Provider built on Mac. Worked fine.
"I thought that this would save the file on design time. Instead it saves the cache at runtime."
I guess I didn't fully grasp this scenario, especially since I was testing it from an F# script, where there is not much practical difference between design time and runtime. I can certainly change it, but can you please clarify step by step how the schema cache for the current context will be saved at design time? I.e. anything I add to ISqlDataContext will be runtime, right? Where should I define SaveContextSchema, and how will the developer trigger its execution so it will include the schema for all entities referenced in the scope?
I've re-read one of your comments. Here is what you meant, right?
dbContext.ContextSchema(@"C:\TEMP\Schema.txt").SaveNow.Error
But can it be shorter?
dbContext.ContextSchema(@"C:\TEMP\Schema.txt").Save
Tested with MySQL, and it works, but the problem is that ctx.SaveContextSchema("c:\\test2.txt") has to be called as the last line in your program. Which is not easy if you are not going to execute the whole program (e.g. I have a web server, and each method is called only when an action is performed, so I would have to surf through the whole UI before saving).
If we could do it at design time, the design time might have loaded everything already for the compilation? (I'm not sure, but it could be?) Something like this (pseudo, not working code) in SqlDesignTime.fs (expecting that ContextSchemaPath would always be defined and we would check IsOffline some other way):
let saveResponse = ProvidedTypeDefinition("SaveContextResponse", None, isErased = true)
saveResponse.AddMemberDelayed(fun () ->
    lock mySaveLock (fun () ->
        let result =
            if not (String.IsNullOrEmpty contextSchemaPath) then
                try
                    if not (System.IO.File.Exists contextSchemaPath) then
                        System.IO.File.Create(contextSchemaPath).Dispose()
                    prov.GetSchemaCache().Save contextSchemaPath
                    "Saved " + contextSchemaPath + " at " + DateTime.Now.ToString("hh:mm:ss")
                with
                | e -> "Save failed: " + e.Message
            else "ContextSchemaPath is not defined"
        // the save result becomes the name of a method the user can "dot into"
        ProvidedMethod(result, [], typeof<unit>, invokeCode = empty) :> MemberInfo
    )
)
let m = ProvidedMethod("SaveContextSchema",[], (saveResponse :> Type), invokeCode = fun args -> <@@ "" @@>)
m.AddXmlDocComputed(fun () ->
if String.IsNullOrEmpty contextSchemaPath then "ContextSchemaPath static parameter has to be defined to use this function."
else "Schema location: " + contextSchemaPath + ". Write dot after this method to see the save result."
)
yield m :> MemberInfo
"If we could do it design time, the design time might have loaded already everything for the compilation?"
Will it? My tests showed that the cache contained all tables, but columns only for the tables referenced in the current scope. So if I have other source files referencing other tables, their schema would not be saved. This is partly why I went down this path - I have to put some code referencing all the tables that I want cached in order to save the schema. So most likely I will need to write a small design-time script for saving the schema cache.
An alternative to this would be to force saving the complete schema for the whole database. But in the comments above we sort of agreed that we don't want that.
Another (new) option might be to define a method for explicitly referencing certain schema elements and saving them in the cache, e.g.:
ctx.ContextSchema(@"C:\Temp\Schema.txt", [ctx.Main.Customers; ctx.Main.Product; ctx.Main.Orders]).SaveNow
I don't know how easy it is to get anything other than the static parameters at design time without referencing FSharp.Compiler.
”My tests showed that the cache contained all tables but columns for only tables referenced in the current scope”
What is the scope here? Does that mean that if you have 100 different queries, there are actually 100 different concurrent column caches? I think that is not what we want.
No, there is only one schema cache per provider. So each query just adds to the context.
But we probably see two main usage scenarios: saving a schema only for the queries referenced in the context; and saving a schema for an application that uses many different queries in its different components, where it's hard to extract them all into one place, and then it's better to save the whole database schema.
What about ContextSchema and FullSchema with a SaveNow design-time method? And renaming the static parameter to SchemaCachePath.
Perhaps we should not give the developer a choice of filename, just a schema folder path, so the provider can store files there using internal names that contain a timestamp, as you suggested in your example.
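If we go the folder route, deriving the internal file names could be as simple as this sketch (the naming scheme and cacheFileName function are invented for illustration):

```fsharp
// Build a cache file name inside a user-supplied folder, combining the
// schema owner with a UTC timestamp so repeated saves don't overwrite each other.
let cacheFileName (folder: string) (owner: string) =
    let stamp = System.DateTime.UtcNow.ToString("yyyyMMdd-HHmmss")
    System.IO.Path.Combine(folder, sprintf "%s-%s.schema.json" owner stamp)
```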
I will be away until Tuesday, but will be checking the thread.
Created a PR for you to do the saving at design time. Try it and let me know what you think. It works well for me and saves the whole used schema, so you can use it to build the project.
As for stored procedures, I tried this quickly. Added storage for sprocs:
and internal SchemaCache =
    { ...
      Procedures : ResizeArray<Sproc>
      IsOffline : bool }
    with
        static member Empty =
            { ...
              Procedures = ResizeArray<Sproc>()
              IsOffline = false }
Then just modified the procedure calls, e.g. from:

member __.GetSprocs(con) = MSSqlServer.connect con MSSqlServer.getSprocs

to:

member __.GetSprocs(con) =
    if schemaCache.IsOffline then schemaCache.Procedures |> Seq.toList
    else
        let r = MSSqlServer.connect con MSSqlServer.getSprocs
        schemaCache.Procedures.AddRange r
        r
...but as expected, DataContractJsonSerializer fails to serialize the Sproc type. It's a recursive F# type with functions.
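A common workaround (sketched here with invented field names; the real Sproc type differs) is to project the type onto a data-only DTO before serializing, capturing the results of the function-valued fields rather than the functions themselves:

```fsharp
// A type with a function-valued field - data-contract serializers
// cannot serialize the function itself.
type SprocLike =
    { Name : string
      GetParams : unit -> string list }

// A data-only shape a serializer can handle: store the *result* of
// calling GetParams, which is acceptable offline since it is fixed anyway.
type SprocDto =
    { Name : string
      Params : string[] }

let toDto (s: SprocLike) =
    { Name = s.Name
      Params = s.GetParams () |> List.toArray }

// Example with an invented sproc:
let sproc = { Name = "GetCustomer"; GetParams = fun () -> [ "Id" ] }
let dto = toDto sproc
```

The loaded DTO cannot re-query the database, but in offline mode that is exactly the point: the cached parameter list is all the compiler needs.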
I'm working with @tforkmann on a project and we are trying to use SQLProvider. Unfortunately I don't have access to the db, so the type provider can't generate types on my machine. Is there a way for @tforkmann to commit "cached" types so that I can build on my machine and look at the code?