Closed: jimmymathews closed this 1 month ago
The actual problem seems to be that the query used to pull out feature data for a random subselection requires too much temporary storage on the Postgres server. One possible fix is to randomly subselect integer indices and extract the corresponding entries manually from the binary-format feature matrix payloads, which we can query quickly. This depends on first adding the option to include the continuous intensity values in those payloads.
Generating UMAPs currently only works on a somewhat limited dataset because of RAM strain on Postgres. Fix this by doing the random subselection more manually, then pulling only what is needed.
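A rough sketch of the manual subselection idea, for illustration only. It assumes the binary payload is a row-major block of float32 intensity values with a known number of features per cell; the real payload format may differ, and the function names `sample_row_indices` and `extract_rows` are hypothetical, not existing code in this project.

```python
import numpy as np


def sample_row_indices(num_cells, sample_size, seed=None):
    """Pick a sorted random subset of integer row indices, without replacement."""
    rng = np.random.default_rng(seed)
    size = min(sample_size, num_cells)
    return np.sort(rng.choice(num_cells, size=size, replace=False))


def extract_rows(payload, num_features, indices):
    """Pull only the selected rows out of a binary feature matrix payload.

    Assumption: payload is row-major float32 with num_features values per cell.
    """
    matrix = np.frombuffer(payload, dtype=np.float32).reshape(-1, num_features)
    return matrix[indices]  # fancy indexing copies just the sampled rows


# Toy stand-in for a payload fetched from the database.
full = np.arange(20, dtype=np.float32).reshape(5, 4)
payload = full.tobytes()

indices = sample_row_indices(num_cells=5, sample_size=3, seed=0)
subset = extract_rows(payload, num_features=4, indices=indices)
```

The point of this design is that the random sampling happens in the client on plain integers, so Postgres only has to return the payload and never has to materialize a randomly ordered intermediate result.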