VirtualFlyBrain / vfb-pipeline-dumps

Pipeline that creates dumps from the triplestore for consumption by the downstream services
Apache License 2.0

FlyBase:FBrf 'xrefs' should not be copied over as synonyms into SOLR #9

Closed dosumis closed 4 years ago

dosumis commented 4 years ago
"short_form":"FBbt_00003748",
 "synonym":["FlyBase:FBrf0224194",
          "ME",
          "FlyBase:FBrf0212704",
          "medulla",
          "Med",
          "m",
          "FlyBase:FBrf0212889",
          "optic medulla"],

Don't think xrefs on synonyms belong here.

Maybe this is the relevant code? https://github.com/VirtualFlyBrain/vfb-pipeline-dumps/blob/master/scripts/obographs-solr.py#L158
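For illustration, a minimal sketch of how leaked references like these could be spotted in a Solr document. The helper name `find_xref_like_synonyms` and the regular expression are assumptions made for this example and are not part of the pipeline; the pattern simply matches the `FlyBase:FBrf` plus digits shape seen in the entry quoted in this comment.

```python
import re

# Assumption: FlyBase reference xrefs look like "FlyBase:FBrf" followed by digits,
# as in the entry quoted in this comment.
FBRF_PATTERN = re.compile(r"^FlyBase:FBrf\d+$")


def find_xref_like_synonyms(doc):
    """Return the synonyms in a Solr-style document that look like FBrf xrefs."""
    return [s for s in doc.get("synonym", []) if FBRF_PATTERN.match(s)]


# The document fragment from this issue, reproduced as a Python dict.
doc = {
    "short_form": "FBbt_00003748",
    "synonym": [
        "FlyBase:FBrf0224194", "ME", "FlyBase:FBrf0212704", "medulla",
        "Med", "m", "FlyBase:FBrf0212889", "optic medulla",
    ],
}

leaked = find_xref_like_synonyms(doc)
print(leaked)
```

Run against a full dump, a check of this kind would list every term whose synonym field still carries reference IDs.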

matentzn commented 4 years ago

So if you don't want xrefs of synonyms to show up at all, is there a reason not to just remove this from the code, as you indicate?

if 'xrefs' in syn:
    se["synonym"].extend(syn['xrefs'])
    se["synonym_autosuggest"].extend(syn['xrefs'])

dosumis commented 4 years ago

Don't see a use case for xrefs showing up in autocomplete, so yes, please remove.
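A rough sketch of what the synonym handling might look like once the xrefs are no longer copied. This is not the actual code in `obographs-solr.py`: the function `collect_synonyms` is hypothetical, the input is assumed to follow the OBO Graphs synonym shape (`pred`, `val`, optional `xrefs`), and the pairing of xrefs with labels in the sample data is made up for the example. Only the `se["synonym"]` and `se["synonym_autosuggest"]` field names come from the snippet quoted in this thread.

```python
def collect_synonyms(synonyms):
    """Build the synonym fields from synonym labels only, ignoring their xrefs."""
    se = {"synonym": [], "synonym_autosuggest": []}
    for syn in synonyms:
        label = syn.get("val")
        if label is None:
            # Skip malformed synonym entries that carry no label.
            continue
        se["synonym"].append(label)
        se["synonym_autosuggest"].append(label)
        # syn.get("xrefs") is deliberately not copied into either field.
    return se


# Hypothetical input; which xref belongs to which synonym is illustrative only.
synonyms = [
    {"pred": "hasExactSynonym", "val": "medulla", "xrefs": ["FlyBase:FBrf0224194"]},
    {"pred": "hasExactSynonym", "val": "ME", "xrefs": ["FlyBase:FBrf0212704"]},
    {"pred": "hasRelatedSynonym", "val": "optic medulla"},
]

entry = collect_synonyms(synonyms)
print(entry["synonym"])
```

With this shape, the autosuggest field only ever receives human-readable labels, so reference IDs cannot index a term for autocomplete.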

Robbie1977 commented 4 years ago

@matentzn FYI this is a lower priority issue. @dosumis do we not want xrefs for the SkIds, bodyIds, etc.?

dosumis commented 4 years ago

> @matentzn FYI this is a lower priority issue.

Isn't it important to ensure we're not indexing terms by the xrefs on their synonyms for autocomplete? Happy to just do the edit myself.

> @dosumis do we not want xrefs for the SkIds, bodyIds, etc.?

That should be a separate job: pulling IDs from xref edges to Site nodes. @matentzn how would I specify that? Add some SPARQL to the query that generates the JSON input, and then edit this dump script to add them?

matentzn commented 4 years ago

@dosumis let's make a new ticket for that. I am not sure off the top of my head, but yes, if you want to give it a shot: go to http://ts.p2.virtualflybrain.org/rdf4j-workbench/repositories/vfb/query, play with a SELECT query until you find the required information, then add a CONSTRUCT query to the sparql directory as you previously did. I have generalised (and documented) the SPARQL construct pipeline, so you will see what to do when you open the Makefile! Let me know if you need help!

matentzn commented 4 years ago

This issue is done and checked by @Robbie1977 - can be closed.