bmeg / bmeg-etl

ETL configuration for BMEG

V([big_list]) produces no results? #333

Closed bwalsh closed 5 years ago

bwalsh commented 5 years ago

There might be a silent server-side error when a large list of gids is passed to V().

Create a long list of alleles

case_alleles = list(set([p for p in O.query().V(unique_aliquots).as_("aliquot").in_("CallsetFor").out("AlleleCall").as_("allele").render("$allele._gid")]))
print('There are {} alleles associated with cases'.format(len(case_alleles)))

>>>>  [INFO]    2019-06-14 13:02:35,930 1,208,510 results received in 228 seconds
There are 560699 alleles associated with cases

No hits are returned

case_allele_g2p = list(set([p for p in O.query().V(case_alleles).as_("allele").in_("HasAlleleFeature").as_("g2p").render("$g2p._gid")]))
print('There are {} alleles associated with case alleles'.format(len(case_allele_g2p)))

>>>> There are 0 alleles associated with case alleles

Try it with the first 50 alleles

case_allele_genes = list(set([p for p in O.query().V(case_alleles[:50]).as_("allele").out("AlleleIn").as_("gene").render("$gene._gid")]))
print('There are {} genes associated with case alleles ({})'.format(len(case_allele_genes), len(case_alleles[:50])))

>>>> [INFO] 2019-06-14 14:03:46,498 50 results received in 0 seconds
There are 50 genes associated with case alleles (50)

adamstruck commented 5 years ago

I am seeing the following type of error in the server logs:

{
  "msg": "MongoDb: iterating results",
  "error": "BSONObj size: 33531625 (0x1FFA6E9) is invalid. Size must be between 0 and 16793600(16MB) First element: aggregate: \"bwalsh-test_vertices\""
}

The input BSON query (about 33.5 MB for this list of 560,699 gids) is larger than the 16 MB maximum that MongoDB allows.
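
One possible client-side workaround, sketched here as an assumption rather than the fix that was eventually applied, is to split the gid list into batches so that each V() call stays well under the limit. The helper names `chunked` and `query_in_batches` and the default `batch_size` are hypothetical; `run_query` stands for any callable that takes a list of gids and returns an iterable of results.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items from `items`."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def query_in_batches(gids, run_query, batch_size=10000):
    """Call `run_query(batch)` once per batch and merge the results into a set.

    `run_query` is a caller-supplied function (hypothetical here); the batch
    size is a guess and may need tuning for the gids in a given graph.
    """
    results = set()
    for batch in chunked(gids, batch_size):
        results.update(run_query(batch))
    return results
```

With the client from the original report, `run_query` could be something like `lambda batch: O.query().V(batch).as_("allele").in_("HasAlleleFeature").as_("g2p").render("$g2p._gid")`, passed together with `case_alleles`. Merging into a set mirrors the `list(set(...))` de-duplication used above.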

bwalsh commented 5 years ago

Thanks. Would it be useful to return that error to the client(s)?
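
Independently of what the server reports, a client could fail loudly before sending an oversized list. The sketch below is an assumption, not part of the BMEG client: `estimate_bson_array_size` and `check_gid_list` are hypothetical names, and the estimate covers only the gid array as BSON would encode it, so the full query will be somewhat larger.

```python
MAX_BSON_BYTES = 16 * 1024 * 1024  # MongoDB's 16 MB maximum BSON document size


def estimate_bson_array_size(strings):
    """Estimate the BSON-encoded size, in bytes, of an array of strings."""
    size = 4 + 1  # int32 length prefix + trailing null byte of the array document
    for index, value in enumerate(strings):
        key_len = len(str(index)) + 1                   # array index stored as a C string
        value_len = 4 + len(value.encode("utf-8")) + 1  # int32 length + bytes + null
        size += 1 + key_len + value_len                 # 1 byte for the element type tag
    return size


def check_gid_list(gids, limit=MAX_BSON_BYTES):
    """Raise ValueError if the gid array alone would exceed `limit` bytes."""
    size = estimate_bson_array_size(gids)
    if size > limit:
        raise ValueError(
            "gid list is about {} bytes, over the {} byte limit; "
            "split it into smaller batches".format(size, limit)
        )
    return size
```

Calling `check_gid_list(case_alleles)` before the V() query would turn the empty result seen above into an explicit error, provided the gid array is what pushes the query over the limit.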

adamstruck commented 5 years ago

Fixed in https://github.com/bmeg/grip/pull/207