ropensci / elastic

R client for the Elasticsearch HTTP API
https://docs.ropensci.org/elastic

Complex queries using AND and OR #172

komalsrathi closed this issue 7 years ago

komalsrathi commented 7 years ago

How do I run a complex query that searches on two fields? My query is below, but I think it is incorrect: it returns the documents matching gene_info.symbol, but instead of returning only the one sample ID I asked for, it returns all sample IDs:

# fields to return
> print(body)
"{\"_source\": [\"gene_info.symbol\",\"samples.sample_id\",\"samples.rsem.fpkm\",\"gene_info.biotype\"]}"

# fields to query
> print(query)
"gene_info.symbol:(A1BG OR PDK1) AND samples.sample_id:(C021_0001_20140916_tumor_RNASeq)"

out <- Search(index = myindexname, 
              type = typename, 
              q = query, 
              body = body, 
              raw = TRUE)

This is the structure of the JSON:

    "hits" : [
      {
        "_index" : "pnoc",
        "_type" : "genes",
        "_id" : "ENSG00000245105.2",
        "_score" : 11.119687,
        "_source" : {
          "gene_info" : {
            "biotype" : "antisense",
            "transcripts" : [
              {
                "biotype" : "antisense",
                "entrez_id" : [
                  "144571"
                ],
                "transcript_id" : "ENST00000499762.2",
                "end" : "9068060",
                "refseq_protein_id" : [ ],
                "start" : "9065177",
                "refseq_mrna_id" : [
                  "NR_026971.1"
                ]
              }
            ],
            "symbol" : "A2M-AS1",
            "chr" : "chr12",
            "end" : "9068060",
            "strand" : "+",
            "start" : "9065177"
          },
          "samples" : [
            {
              "sample_id" : "C021_0001_20140916_tumor_RNASeq",
              "rsem" : {
                "expected_count" : 281.37,
                "effective_length" : 2005.12,
                "fpkm" : 1.14,
                "length" : 2192.0,
                "tpm" : 1.43
              },
...

Using my query, I am unable to retrieve the specific samples.
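
One workaround is to filter the returned samples on the client side. The sketch below uses base R only and assumes each parsed hit is a nested list shaped like the JSON above. `keep_samples()` is a hypothetical helper, not part of the elastic package, and the second sample in the example data is invented for illustration.

```r
# Hypothetical parsed hit, shaped like the JSON structure shown above.
# The second sample is invented purely for illustration.
hit <- list(
  `_id` = "ENSG00000245105.2",
  `_source` = list(
    gene_info = list(symbol = "A2M-AS1", biotype = "antisense"),
    samples = list(
      list(sample_id = "C021_0001_20140916_tumor_RNASeq",
           rsem = list(fpkm = 1.14)),
      list(sample_id = "hypothetical_other_sample",
           rsem = list(fpkm = 2.5))
    )
  )
)

# keep_samples() is a hypothetical helper (not an elastic function):
# it drops every entry of _source$samples whose sample_id is not wanted.
keep_samples <- function(hit, sample_ids) {
  samples <- hit[["_source"]][["samples"]]
  keep <- vapply(
    samples,
    function(s) s[["sample_id"]] %in% sample_ids,
    logical(1)
  )
  hit[["_source"]][["samples"]] <- samples[keep]
  hit
}

filtered <- keep_samples(hit, "C021_0001_20140916_tumor_RNASeq")
```

With `raw = FALSE`, the hits should be available as a list under `out$hits$hits`, so something like `lapply(out$hits$hits, keep_samples, sample_ids = "C021_0001_20140916_tumor_RNASeq")` would apply the filter to every hit. With `raw = TRUE`, as in the call above, the JSON string would first need to be parsed into nested lists.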

komalsrathi commented 7 years ago

I think this is a problem with the structure of the database.
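
One possible explanation, since the index mapping is not shown here: if `samples` is stored as an ordinary array of objects, Elasticsearch matches whole documents, and `_source` filtering selects fields rather than individual array elements. Every sample of a matching gene would then come back. The sketch below assumes the index could be recreated with `samples` mapped as a `nested` field, and builds a request body that uses a nested query with `inner_hits`. The name `matched_samples` is an arbitrary label chosen for illustration.

```r
# Assumption: "samples" is mapped as type "nested" in the index.
# With an ordinary object mapping, the nested query below will not work.
sample_id <- "C021_0001_20140916_tumor_RNASeq"

body <- list(
  # Gene-level fields to return from the parent document.
  `_source` = c("gene_info.symbol", "gene_info.biotype"),
  query = list(
    bool = list(
      must = list(
        # Same symbol condition as the original query string.
        list(query_string = list(query = "gene_info.symbol:(A1BG OR PDK1)")),
        # Match on individual samples and return only the matching ones.
        list(nested = list(
          path = "samples",
          query = list(match = list(`samples.sample_id` = sample_id)),
          inner_hits = list(name = "matched_samples")
        ))
      )
    )
  )
)

# Not run here: needs a live Elasticsearch connection. Newer versions of
# elastic also expect a connection object as the first argument.
# out <- Search(index = myindexname, type = typename, body = body)
```

If this works as intended, the matching samples for each hit should appear under `inner_hits$matched_samples$hits$hits` rather than in `_source`.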

sckott commented 7 years ago

Let me know if I can help.