nathandunn closed 3 years ago
The problem is that after uploading the first one, I can't really upload a subsequent one, as individual keys have a size limit.
Some possible solutions would be to remove unused data (e.g., sample names), which gets us a bit further but is not a great solution. We could also use individual keys, which is also problematic.
I think if we wanted to store analysis results for GMT files, this would be a preferred method, especially as I think GMT files are easy to create and fun to analyze. Also, hallmark is 50 gene sets. When we are looking at 5K, it quite simply won't work. I think the limit is 5200000 characters (https://stackoverflow.com/a/61018107/1739366). On the flip side, we could pare down the data and set a hard limit on gene set size, but this won't scale at all.
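To get a rough feel for how quickly results outgrow that limit, here is a minimal sketch. It assumes the 5200000-character figure quoted above is the budget, and that a result is a JSON object with one list of per-sample scores per gene set. The names `make_result` and `fits_in_storage`, the result shape, and the sample counts are all hypothetical, not what the app actually stores.

```python
import json
from typing import Dict, List

# Assumed budget, taken from the figure quoted above; real limits vary by browser.
STORAGE_LIMIT_CHARS = 5_200_000


def make_result(n_gene_sets: int, n_samples: int) -> Dict[str, List[float]]:
    """Build a placeholder result: one list of per-sample scores per gene set."""
    return {f"geneset_{i}": [0.0] * n_samples for i in range(n_gene_sets)}


def serialized_size(result: dict) -> int:
    """Number of characters in the compact JSON form of a result."""
    return len(json.dumps(result, separators=(",", ":")))


def fits_in_storage(results: List[dict], limit: int = STORAGE_LIMIT_CHARS) -> bool:
    """True if all results together stay within the character budget."""
    return sum(serialized_size(r) for r in results) <= limit


if __name__ == "__main__":
    small = make_result(n_gene_sets=50, n_samples=500)    # hallmark-sized
    large = make_result(n_gene_sets=5000, n_samples=500)  # the 5K case
    print(serialized_size(small), fits_in_storage([small]))
    print(serialized_size(large), fits_in_storage([large]))
```

Even with placeholder scores, size grows with gene sets times samples, so trimming fields only delays hitting the limit.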
Looking at #644, I think we need a more robust solution. It's a bit wonky, but I feel as though it provides a more robust solution.
We would then have:
client -> analysis-storage-server / R-server -> xena
In this case we wouldn't hit the R-server directly ever and would likely just do an exec.
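A minimal sketch of the "just do an exec" idea, assuming the storage server shells out to `Rscript` rather than calling the R-server over HTTP. The script name, its arguments, and the idea that it prints its result to stdout are all hypothetical; the `runner` parameter exists only so the call can be swapped out when R is not installed.

```python
import subprocess
from typing import Callable, List


def build_command(script: str, gmt_path: str, cohort: str) -> List[str]:
    """Command line for a hypothetical analysis script taking a GMT file and a cohort."""
    return ["Rscript", script, gmt_path, cohort]


def run_analysis(
    script: str,
    gmt_path: str,
    cohort: str,
    runner: Callable = subprocess.run,
) -> str:
    """Run the analysis as a child process and return whatever it wrote to stdout."""
    completed = runner(
        build_command(script, gmt_path, cohort),
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode output as str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout
```

The client would then only ever talk to the storage server, which decides whether to return a stored result or run the analysis.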
In terms of the architecture, for storage, we are creating keys into JSON blobs right now:
data type / analysis method
1- geneset list
1- geneset results
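One way to read that layout, sketched under assumptions: a key is built from data type, analysis method, and gene set list name, and it points at a JSON blob of gene set results. The key format, the `BlobStore` class, and the example values are hypothetical; an in-memory dict stands in for whatever backend store ends up being used.

```python
import json
from typing import Dict, Optional


def make_key(data_type: str, method: str, gene_set_list: str) -> str:
    """Compose a storage key from the three levels (format is an assumption)."""
    return f"{data_type}/{method}/{gene_set_list}"


class BlobStore:
    """In-memory stand-in for a backend store of JSON blobs."""

    def __init__(self) -> None:
        self._blobs: Dict[str, str] = {}

    def put(self, key: str, results: dict) -> None:
        """Serialize the results and store them under the key."""
        self._blobs[key] = json.dumps(results)

    def get(self, key: str) -> Optional[dict]:
        """Return the stored results, or None if the key is unknown."""
        raw = self._blobs.get(key)
        return None if raw is None else json.loads(raw)


if __name__ == "__main__":
    store = BlobStore()
    key = make_key("gene expression", "method-a", "hallmark")
    store.put(key, {"geneset_0": [0.1, 0.2]})
    print(key, store.get(key))
```

Keeping one blob per key means a new gene set list adds a new entry instead of growing a single value.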
https://docs.google.com/spreadsheets/d/1oU6YG_7JK6_qBuTNdkXRw6GGiX3vQpzg3f47t84rJAA/edit#gid=1556857962 (source: https://github.com/pouchdb/pouchdb/issues/4031#issuecomment-241243773)
... PouchDB might be interesting here.
Getting this error:
I think this is related to #644 in that we quite simply need a backend store to make this work.