populationgenomics / hail

Scalable genomic data analysis.
https://hail.is
MIT License

Expose job resource in batch API #329

Closed by illusional 6 months ago

illusional commented 6 months ago

It's useful for us to be able to programmatically access the resource data for jobs and batches. This is stored as a dataframe, which we'll send back as JSON with orient='split'.

I've tested this in a dev deploy, and it worked fairly well.
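
For context, here is a rough sketch of what a dataframe looks like once serialised with orient='split'. The frame and its column names (`time_msecs`, `cpu_usage`) are made up for illustration and are not the actual resource schema returned by Hail Batch.

```python
import json

import pandas as pd

# Hypothetical resource-usage frame; the column names are invented for illustration.
usage = pd.DataFrame({"time_msecs": [0, 1000], "cpu_usage": [0.25, 0.5]})

# orient='split' stores the column labels, the row index and the row values
# as three separate lists under the keys 'columns', 'index' and 'data'.
payload = json.loads(usage.to_json(orient="split"))

for key, value in payload.items():
    print(key, value)
```

Because the column labels are sent once rather than repeated for every row, this layout stays fairly compact for long time series.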

For my own reference, here's how to convert it back into a dataframe:

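import pandas as pd

# Placeholder: replace with the parsed JSON response from Hail Batch
response = {}

# Each value is an orient='split' payload with 'columns' and 'data' keys
dataframes = {
    key: pd.DataFrame(data=values['data'], columns=values['columns'])
    for key, values in response.items()
}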
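
A possible variation, if the original row index matters: an orient='split' payload normally also carries an 'index' list, which can be passed through when rebuilding the frame. The sketch below round-trips a made-up frame to check this. The helper name `frames_from_response`, the `main` key and the column names are all hypothetical, not part of the Hail Batch API.

```python
import json

import pandas as pd


def frames_from_response(response: dict) -> dict:
    """Rebuild one DataFrame per key from orient='split' payloads."""
    return {
        key: pd.DataFrame(
            data=values["data"],
            columns=values["columns"],
            # If 'index' is missing, None gives a default RangeIndex.
            index=values.get("index"),
        )
        for key, values in response.items()
    }


# Hypothetical response: the key and column names are placeholders.
original = pd.DataFrame({"time_msecs": [0, 1000], "memory_in_bytes": [1024, 2048]})
response = {"main": json.loads(original.to_json(orient="split"))}

restored = frames_from_response(response)["main"]

# Raises AssertionError if values, columns or index differ after the round trip.
pd.testing.assert_frame_equal(restored, original)
```
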
illusional commented 6 months ago

I have suggested this upstream, as it reduces the number of merge conflicts we might run into, and it might be useful for them: https://github.com/hail-is/hail/pull/14328