Open drassokhin opened 9 years ago
What would be the start and end of a session? SPARQL is a stateless protocol, and all requests are independent of each other.
This suggestion has nothing to do with the SPARQL protocol and applies only to connections to Virtuoso databases via native Virtuoso drivers (ODBC, JDBC, ADO.NET, etc.).
On Apr 9, 2015, at 08:36 , Alexey Zakhlestin notifications@github.com wrote:
What would be the start and end of a session? SPARQL (http://www.w3.org/TR/2013/REC-sparql11-protocol-20130321/) is a stateless protocol, and all requests are independent of each other.
As the requirements of different applications differ, I'd try a custom dba-owned stored procedure that creates a new graph (say, a UUID-based name would be OK), grants the permissions on it, and remembers its name in some connection variable. Another function may be used to share the graph between different connections, with the desired security checks inside the function. Later, on disconnect, a custom DBEV_DISCONNECT() event handler function should inspect the connection variable and drop the graph. That might be flexible enough, unlike any single built-in method I can imagine.
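The control flow of that approach can be modelled in a few lines. This is a plain-Python sketch, not Virtuoso/PL: `GraphStore`, `Connection`, `create_session_graph`, `share_session_graph`, and `on_disconnect` are hypothetical names, and an in-memory dictionary stands in for the store and its permission tables. It only illustrates the create / remember / share / drop-on-disconnect sequence.

```python
import uuid


class GraphStore:
    """In-memory stand-in for the store: graph IRI -> set of users allowed to use it."""

    def __init__(self):
        self.grants = {}


class Connection:
    """Stand-in for a client connection with per-connection variables."""

    def __init__(self, user):
        self.user = user
        self.vars = {}  # plays the role of connection variables


def create_session_graph(store, conn):
    """What the dba-owned procedure would do: create, grant, remember."""
    iri = "urn:tmp:" + str(uuid.uuid4())  # UUID-based name (assumed naming scheme)
    store.grants[iri] = {conn.user}       # grant access to the caller only
    conn.vars["session_graph"] = iri      # remembered for the disconnect handler
    return iri


def share_session_graph(store, owner_conn, other_user):
    """Sharing function with a security check: only the owning connection may share."""
    iri = owner_conn.vars.get("session_graph")
    if iri is None or owner_conn.user not in store.grants.get(iri, set()):
        raise PermissionError("connection does not own a session graph")
    store.grants[iri].add(other_user)


def on_disconnect(store, conn):
    """What a disconnect event handler would do: inspect the variable, drop the graph."""
    iri = conn.vars.pop("session_graph", None)
    if iri is not None:
        store.grants.pop(iri, None)
```

In this model the application never deletes the graph itself; the cleanup is tied to the connection ending, which is the property the stored-procedure approach is meant to provide.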
Yes, the custom stored procedure approach has its merits, but it would require the developer to learn the internals of a particular triple store, such as Virtuoso. I believe the creation of temporary graphs should be supported natively via an extension of the SPARQL CREATE GRAPH statement, just as the creation of temporary tables is supported via extensions of the SQL CREATE TABLE statement in many relational database management systems. The scenarios where temporary graphs can greatly simplify the implementation of application logic are actually quite common. If you are interested, I can show you a couple of examples from my own experience in bio- and cheminformatics applications, but the general usage pattern is described, for example, in the Temporary Tables section of this Oracle doc page: http://docs.oracle.com/cd/B28359_01/server.111/b28310/tables003.htm#ADMIN11633 (they are talking about temporary tables there, but the idea translates to temporary graphs in the RDF world).
but it would require the developer to learn the internals of a particular triple store
@drassokhin but you're still proposing a custom extension (temporary table) to a non-standard protocol (sparql-over-odbc). It's pretty much the same.
If you want an interoperable solution, you should:
On 4/10/15 6:40 PM, drassokhin wrote:
Yes, the custom stored procedure approach has its merits, but it would require the developer to learn the internals of a particular triple store, such as Virtuoso. I believe the creation of temporary graphs should be supported natively via an extension of the SPARQL CREATE GRAPH statement, just as the creation of temporary tables is supported via extensions of the SQL CREATE TABLE statement in many relational database management systems. The scenarios where temporary graphs can greatly simplify the implementation of application logic are actually quite common. If you are interested, I can show you a couple of examples from my own experience in bio- and cheminformatics applications, but the general usage pattern is described, for example, in the Temporary Tables section of this Oracle doc page: http://docs.oracle.com/cd/B28359_01/server.111/b28310/tables003.htm#ADMIN11633 (they are talking about temporary tables there, but the idea translates to temporary graphs in the RDF world).
How have you arrived at the conclusion that CREATE GRAPH isn't supported by Virtuoso?
Why do you think CREATE GRAPH == CREATE TABLE? A Named Graph is an RDF DBMS-hosted Document composed of RDF Statements representing Relations. A SQL Table is a Relation; it is stored in a SQL RDBMS-hosted document too, but that document isn't typically exposed to the SQL RDBMS user.
Anyway, here is a simple testsuite re., Virtuoso's support of SPARQL Update, Insert, Delete, and Graph creation functionality:
DROP GRAPH <urn:sparql:rww:qa:tests:data> ;
CREATE GRAPH <urn:sparql:rww:qa:tests:data> ;
INSERT {GRAPH <urn:sparql:rww:qa:tests:data> {<#doc> a foaf:Document; foaf:primaryTopic <http://kingsley.idehen.net/dataspace/person/kidehen#this> } }
COPY GRAPH <urn:sparql:rww:qa:tests:data> TO GRAPH <urn:sparql:11:qa:tests2:data>
MOVE GRAPH <urn:sparql:rww:qa:tests:data> TO GRAPH <urn:sparql:11:qa:tests3:data>
SELECT * FROM <urn:sparql:rww:qa:tests:data> WHERE {?s ?p ?o}
SELECT * FROM <urn:sparql:11:qa:tests3:data> WHERE {?s ?p ?o}
WITH GRAPH <urn:sparql:rww:qa:tests:data> INSERT {<#kidehen> owl:sameAs <http://kingsley.idehen.net/dataspace/person/kidehen#this>}
urn:sparql:11:qa:tests3:data using: ADD GRAPH <urn:sparql:rww:qa:tests:data> TO GRAPH <urn:sparql:rww:qa:tests2:data>
SELECT * FROM <urn:sparql:rww:qa:tests2:data> WHERE {?s ?p ?o}
Regards,
Kingsley Idehen Founder & CEO OpenLink Software Company Web: http://www.openlinksw.com Personal Weblog 1: http://kidehen.blogspot.com Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen Twitter Profile: https://twitter.com/kidehen Google+ Profile: https://plus.google.com/+KingsleyIdehen/about LinkedIn Profile: http://www.linkedin.com/in/kidehen Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this
I guess my suggestion was misinterpreted. I never said that the current version of Virtuoso does not support CREATE GRAPH. Neither did I say that CREATE GRAPH == CREATE TABLE. All that I am saying is that it would be very convenient to have an extension of the CREATE GRAPH statement that would create a temporary graph with automatic scope and lifetime management (please see my original post). Just to illustrate the issue, I am attaching a relatively large code snippet in C# that does the following:
1) Given a structure ID, searches for chemically similar structures using a chemistry cartridge for MS SQL Server (note the use of temporary tables prefixed with # in SQL Server).
2) Creates a temporary graph in a Virtuoso triple store and populates it with the results of the similarity search.
3) Searches the ChEMBL dataset stored in the chembl19 graph for biological activities of similar compounds found in step 1.
Note that the code has to explicitly manage the naming and lifetime of the temporary graph. Also note that the code has to use the highly privileged dba login so it can create the temporary graph (again, see my original post for details of the issue).
void Main()
{
    // Find ChEMBL activity data on the specified jnj compound and similar structures.
    int jnjNumber = 1923493; // Topamax
    var qq = from s in StageSharedStructures where s.JNJ_NUMBER == jnjNumber select s.SMILES;
    string targetSmiles = qq.First();
    // Find similar structures
    string q = @"
        declare @cutoff float;
        declare @smiles varchar(max);
        declare @fpSearch varbinary(max);
        set @cutoff = 0.60;
        set @smiles = '{0}';
        set @fpSearch = chem.ComputeFpStructureStr(@smiles, 'EFcfp4', 0);
        select Id,
            chem.BitsetSimilarityTanimoto(@fpSearch, FpEFcfp4) as Similarity
        into #matchedStructId
        from chem.FPEX_chem_Structure_EncodedMolecule
        where chem.BitsetSimilarityTanimoto(@fpSearch, FpEFcfp4) >= @cutoff;
        select s.*, m.Similarity from chem.StructureChemblAbcdExV s join #matchedStructId m on m.Id = s.Id where ChemblId is not null order by m.Similarity desc, s.SmilesHash;
    ";
    q = string.Format(q, targetSmiles);
    var r = this.ExecuteQuery<StructureChemblAbcdExSimV>(q);
    string usr = "dba"; //"chemgen_reader" won't work;
    string pwd = "dba"; //"chemgen_reader";
    const string virtuosoHost = "virtjnjxxx";
    VirtuosoManager virtuoso = new VirtuosoManager(virtuosoHost, VirtuosoManager.DefaultPort, VirtuosoManager.DefaultDB, usr, pwd, 100);
    string tempGraphName = string.Format(@"tmp-xx{0}", DateTime.Now.Ticks); // collision unlikely
    virtuoso.Update(string.Format("CREATE GRAPH <{0}>", tempGraphName));
    StringBuilder sb = new StringBuilder();
    sb.AppendFormat("INSERT DATA IN <{0}> {{ ", tempGraphName); // {{!!!
    foreach (var rr in r)
        sb.AppendFormat("<http://rdf.ebi.ac.uk/resource/chembl/molecule/{0}> <tanimoto> {1} .\r\n", rr.ChemblId, rr.Similarity);
    sb.Append(" }");
    string iq = sb.ToString();
    virtuoso.Update(iq);
    try
    {
        q =
        @"
            PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
            PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
            PREFIX owl: <http://www.w3.org/2002/07/owl#>
            PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
            PREFIX dc: <http://purl.org/dc/elements/1.1/>
            PREFIX dcterms: <http://purl.org/dc/terms/>
            PREFIX foaf: <http://xmlns.com/foaf/0.1/>
            PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
            PREFIX cco: <http://rdf.ebi.ac.uk/terms/chembl#>
            PREFIX cheminf: <http://semanticscience.org/resource/>
            PREFIX chembl_molecule: <http://rdf.ebi.ac.uk/resource/chembl/molecule/>
            SELECT
                ?tanimotoSim ?molecule ?molName ?smiles ?atc ?activity ?pchembl ?assay ?assayDesc ?target ?targetcmpt ?uniprot ?targetCompDescription
            from <chembl19>
            from named <zzzzzzzzz>
            WHERE
            {
                {
                    ?activity a cco:Activity ;
                        cco:hasMolecule ?molecule ;
                        cco:hasAssay ?assay ;
                        cco:pChembl ?pchembl .
                    ?assay dcterms:description ?assayDesc .
                    ?molecule rdfs:label ?molName;
                        cheminf:SIO_000008 ?ss . # Drug has attribute: http://semanticscience.org/resource/SIO_000008
                    optional { ?molecule cco:atcClassification ?atc . } # Add ATC classification code from ChEMBL ontology: https://www.ebi.ac.uk/rdf/services/chembl/describe?uri=http%3A%2F%2Frdf.ebi.ac.uk%2Fterms%2Fchembl%23atcClassification
                    ?ss cheminf:SIO_000300 ?smiles. # That attribute is a value: http://semanticscience.org/resource/SIO_000300
                    ?ss a cheminf:CHEMINF_000018. # That value is a SMILES representation of the structure: http://semanticscience.org/resource/CHEMINF_000018
                    ?assay cco:hasTarget ?target .
                    ?target cco:hasTargetComponent ?targetcmpt .
                    ?targetcmpt cco:targetCmptXref ?uniprot .
                    ?targetcmpt dcterms:description ?targetCompDescription .
                    ?uniprot a cco:UniprotRef .
                    {graph <zzzzzzzzz> {?molecule <tanimoto> ?tanimotoSim .}}
                }
            }
            limit 500
        ";
        q = q.Replace("zzzzzzzzz", tempGraphName); // kludgy, but better than using Format with escaped {}'s
        var res = (SparqlResultSet)virtuoso.Query(q);
        var res1 = res.Select(rr => new Res
            {
                TanimotoSim = GetDouble(rr, "tanimotoSim"),
                ActivityUri = GetUri(rr, "activity"),
                AssayDescription = GetString(rr, "assayDesc"),
                AssayUri = GetUri(rr, "assay"),
                Atc = GetString(rr, "atc"),
                MolName = GetString(rr, "molName"),
                MolUri = GetUri(rr, "molecule"),
                pChembl = GetDouble(rr, "pchembl"),
                Smiles = GetString(rr, "smiles"),
                TargetComponent = GetUri(rr, "targetcmpt"),
                TargetComponentDescription = GetString(rr, "targetCompDescription"),
                TargetUri = GetUri(rr, "target"),
                UniprotUri = GetUri(rr, "uniprot")
            }
        ).OrderByDescending(rrr => rrr.TanimotoSim);
        res1.Dump();
        string tmpPath0 = Path.GetTempFileName();
        string tmpPath = tmpPath0 + ".csv";
        File.Move(tmpPath0, tmpPath);
        //Util.ToCsvString(res1, "ActivityUri").Dump();
        Util.WriteCsv(res1, tmpPath,
            "TanimotoSim",
            "MolUri",
            "MolName",
            "Smiles",
            "Atc",
            "ActivityUri",
            "pChembl",
            "AssayUri",
            "AssayDescription",
            "TargetUri",
            "TargetComponent",
            "UniprotUri",
            "TargetComponentDescription");
        Process.Start(@"c:\3dx\3dx.exe", tmpPath);
    }
    finally
    {
        virtuoso.Update(string.Format("DROP GRAPH <{0}>", tempGraphName));
    }
}
static Uri GetUri(SparqlResult sr, string key)
{
    INode val;
    if (!sr.TryGetBoundValue(key, out val)) return null;
    return ((IUriNode)val).Uri;
}
static double? GetDouble(SparqlResult sr, string key)
{
    INode val;
    if (!sr.TryGetBoundValue(key, out val)) return null;
    return val.AsValuedNode().AsDouble();
}
static string GetString(SparqlResult sr, string key)
{
    INode val;
    if (!sr.TryGetBoundValue(key, out val)) return null;
    return ((ILiteralNode)val).Value;
}
class Res
{
    public double? TanimotoSim { get; set; }
    public Uri MolUri { get; set; }
    public string MolName { get; set; }
    public string Smiles { get; set; }
    public string Atc { get; set; }
    public Uri ActivityUri { get; set; }
    public double? pChembl { get; set; }
    public Uri AssayUri { get; set; }
    public string AssayDescription { get; set; }
    public Uri TargetUri { get; set; }
    public Uri TargetComponent { get; set; }
    public Uri UniprotUri { get; set; }
    public string TargetComponentDescription { get; set; }
}
On 4/13/15 11:03 AM, drassokhin wrote:
I guess my suggestion was misinterpreted. I never said that the current version of Virtuoso does not support CREATE GRAPH. Neither did I say that CREATE GRAPH == CREATE TABLE. All that I am saying is that it would be very convenient to have an extension of the CREATE GRAPH statement that would create a temporary graph with automatic scope and lifetime management (please see my original post). Just to illustrate the issue, I am attaching a relatively large code snippet in C# that does the following: 1) Given a structure ID, searches for chemically similar structures using a chemistry cartridge for MS SQL Server (note the use of temporary tables prefixed with # in SQL Server). 2) Creates a temporary graph in a Virtuoso triple store and populates it with the results of the similarity search. 3) Searches the ChEMBL dataset stored in the chembl19 graph for biological activities of similar compounds found in step 1.
You are requesting a CREATE GRAPH extension that adds a duration argument. Basically, when such a graph is created, it is automatically deleted by the system once it exceeds its duration, right?
No, I am not requesting an extended CREATE GRAPH with a duration argument. I am requesting an extension to CREATE GRAPH that would create a temporary graph with automatic visibility scope and lifetime management.
An extended CREATE GRAPH should be able to create temporary graphs of two types: local and global. A local temporary graph should be visible only to the user who created it (and only in the current session) and should be deleted automatically at the end of the session in which it was created. A global temporary graph should be visible to any user and any connection after it is created, and should be deleted when all users referencing the graph disconnect from the Virtuoso instance. Note that these types of temporary graphs make sense only in the context of a connection session, when connections to Virtuoso instances are created programmatically via native Virtuoso drivers (ODBC, JDBC and ADO.NET) rather than via the standard stateless (and therefore session-less) SPARQL protocol.
Add support for temporary graphs with behavior similar to that of temporary tables in many popular relational database management systems, such as MS SQL Server or Oracle. For example, an extension of the CREATE GRAPH statement could use a reserved graph name prefix with an effect similar to that of the MS SQL Server temporary table prefix (# or ##). That is, CREATE GRAPH <#temp_graph_name> would create a named graph visible only to the user who created it (and only in the current session) and automatically deleted at the end of the session in which it was created. The statement CREATE GRAPH <##temp_graph_name> would create a temporary graph visible to any user and any connection after it is created, and deleted when all users referencing the graph disconnect from the Virtuoso instance.
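To make the requested visibility and lifetime rules concrete, here is a toy in-memory model in Python. It is only a sketch of the semantics described above, not an existing Virtuoso feature: `TempGraphRegistry` and its methods are hypothetical, sessions are identified by plain strings, and it assumes (as with SQL Server `#` tables) that two sessions may use the same local name without colliding.

```python
class TempGraphRegistry:
    """Toy model of the proposed '#' (local) and '##' (global) temporary graphs."""

    def __init__(self):
        self.local_graphs = set()  # (session_id, name) pairs
        self.global_refs = {}      # name -> set of session ids referencing it

    def create(self, session_id, name):
        if name.startswith("##"):
            # Global: the creator counts as the first referencing session.
            self.global_refs.setdefault(name, set()).add(session_id)
        elif name.startswith("#"):
            # Local: scoped to the creating session.
            self.local_graphs.add((session_id, name))
        else:
            raise ValueError("not a temporary graph name: " + name)

    def reference(self, session_id, name):
        """Another session starts using an existing global temporary graph."""
        if name not in self.global_refs:
            raise KeyError(name)
        self.global_refs[name].add(session_id)

    def visible(self, session_id, name):
        if name.startswith("##"):
            return name in self.global_refs
        return (session_id, name) in self.local_graphs

    def disconnect(self, session_id):
        # Local graphs disappear with their session.
        self.local_graphs = {k for k in self.local_graphs if k[0] != session_id}
        # A global graph disappears once no session references it any more.
        for name in list(self.global_refs):
            self.global_refs[name].discard(session_id)
            if not self.global_refs[name]:
                del self.global_refs[name]
```

For example, after session "s1" creates `#t` and `##g`, session "s2" can see `##g` but not `#t`; once "s1" disconnects, `#t` is gone, and `##g` survives only as long as some other session still references it.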
Temporary graphs come in handy when designing triple-store-backed applications and services.
Suppose a user called "data_reader" is designated as the service account under which an application programmatically connects to the triple store. Suppose that the application needs to create a temporary graph (whose name is not known in advance but is generated by the application so it would be unlikely to collide with names of any graphs already existing in the store) and insert some triples into it in order to perform a query combining the data in the temporary triples with the data already stored in other graphs in the same triple store.
Without built-in support for temporary graphs, there are three problems the application developer faces:
1) The application itself is responsible for deleting temporary graphs it creates.
2) The application has to guarantee that the name of a temporary graph it creates won't collide with names of any other graphs in the same triple store.
3) The service account permission issue: if access to all graphs is disabled for all users (or they are given read-only access to all graphs by executing DB.DBA.RDF_DEFAULT_USER_PERMS_SET ('nobody', 1);) and the SPARQL_SELECT role is then assigned to data_reader for some or all graphs in the store, the CREATE GRAPH <tmp-xx635374837152112137> statement run over a connection opened under the data_reader account fails with the following message: "SPARUL CREATE GRAPH access denied: database user 110 (data_reader) has no write permission on graph tmp-xx635374837152112137". So, with the permissions it has, the data_reader user simply cannot create a new graph. Currently, the only available solution is to use a privileged account with permissions to read and create any graph as the application service account, which definitely opens a security hole.
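For comparison, this is roughly what an application has to do today to cope with problems 1 and 2 on the client side. It is a minimal Python sketch under stated assumptions: `temporary_graph` is a hypothetical helper, `execute` is any callable that sends one SPARQL Update string to the store, and the `urn:tmp:` prefix is an arbitrary choice. It does nothing about problem 3; the account behind `execute` still needs permission to create and drop graphs.

```python
import uuid
from contextlib import contextmanager


@contextmanager
def temporary_graph(execute, prefix="urn:tmp:"):
    """Create a uniquely named graph, yield its IRI, and always drop it afterwards."""
    iri = prefix + uuid.uuid4().hex          # random name, so a collision is improbable
    execute("CREATE GRAPH <%s>" % iri)
    try:
        yield iri                            # caller inserts and queries here
    finally:
        execute("DROP GRAPH <%s>" % iri)     # runs even if the caller's code raises
```

Wrapping both the inserts and the query in the `with` block means the graph is dropped even when populating it fails, but the cleanup still depends on the client process staying alive, which is exactly what a server-managed temporary graph would avoid.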