Open dcherian opened 1 year ago
Xarray is a great tool for neuroscience research, since we typically gather data spanning multiple dimensions (trials, days, animals, conditions, etc.). The Allen Institute provides an SDK for reading and processing such data, along with an "observatory" that contains relevant data (https://allensdk.readthedocs.io/en/latest/).
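To make the "multiple dimensions" point concrete, here is a minimal sketch with synthetic data. The dimension names, coordinate labels, and array sizes are made up for illustration and are not taken from the Allen SDK.

```python
import numpy as np
import xarray as xr

# Synthetic responses: 3 animals x 2 days x 4 conditions x 5 trials.
rng = np.random.default_rng(0)
data = rng.normal(size=(3, 2, 4, 5))

responses = xr.DataArray(
    data,
    dims=("animal", "day", "condition", "trial"),
    coords={
        "animal": ["m1", "m2", "m3"],      # hypothetical animal IDs
        "day": [1, 2],
        "condition": ["a", "b", "c", "d"],  # hypothetical condition labels
    },
    name="response",
)

# Average over trials by name instead of remembering an axis number.
mean_by_condition = responses.mean(dim="trial")

# Select one animal on one day by label.
one_animal = responses.sel(animal="m2", day=1)
```

Reducing and selecting by dimension name is the main thing a tutorial dataset like this would need to demonstrate.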
Hello @rsatapat, can we add a subset of the data to xarray-data for future tutorials? Any concerns about adding a subset of the data for tutorials?
Relevant content from @jsiegle: https://xarray.dev/blog/xarray-for-neurophysiology
Just keeping a list of some other examples here
Already using Xarray:
Would require modification to use xarray instead of numpy or custom objects:
Would be interesting to look at modifying some of these examples to see if Xarray would work well in place of straight numpy arrays https://numpy.org/numpy-tutorials/ ... also it's an excellent repository overall
Brainstormed a bit more on this today with @TomNicholas. There are really two separate things to accomplish:
Note: On one hand it's nice to reuse the existing graphic and actual dataset, but we could simplify even further by reducing the size, adding dimension labels to the image on the left, dropping "alleles", and running set_index() on the DataArray on the right so the two match up easily!
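As a rough sketch of what the set_index() step could look like: the array below is a tiny made-up stand-in, not the dataset from the graphic, and the names `genotype` and `sample_id` are assumptions chosen for illustration.

```python
import numpy as np
import xarray as xr

# Tiny stand-in array: 2 variants x 3 samples.
genotypes = xr.DataArray(
    np.array([[0, 1, 2], [1, 1, 0]]),
    dims=("variants", "samples"),
    coords={
        "samples": [0, 1, 2],
        # Non-index coordinate holding human-readable sample labels.
        "sample_id": ("samples", ["S1", "S2", "S3"]),
    },
    name="genotype",
)

# Use the labels as the index of the "samples" dimension.
indexed = genotypes.set_index(samples="sample_id")

# Label-based selection now lines up with the labels shown in a figure.
s2 = indexed.sel(samples="S2")
```

With the labels as the index, the printed DataArray shows the same names as the picture, which is the point of the simplification.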
https://docs.google.com/forms/d/1x9bOIelnUsDMyI1tF4bN7TWK0v4nBDiwhpxh9mi6PaI/edit#responses
One of the user survey responses specifically calls this out:
Examples with Astropy to read FITS files, using Astropy Tables
Some renewed activity in this repository that seems relevant! https://github.com/ratt-ru/xarray-fits/issues/26
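For the FITS request from the survey, a minimal sketch of the plain Astropy route (not the xarray-fits package) could look like this. It writes a small synthetic image to a temporary file so the example is self-contained; the file name, the `OBJECT` value, and the dimension names `y` and `x` are assumptions.

```python
import os
import tempfile

import numpy as np
import xarray as xr
from astropy.io import fits

# Write a small synthetic image so there is a FITS file to read.
image = np.arange(12, dtype="float32").reshape(3, 4)
path = os.path.join(tempfile.mkdtemp(), "example.fits")
hdu = fits.PrimaryHDU(image)
hdu.header["OBJECT"] = "synthetic"
hdu.writeto(path)

# Read it back and wrap the primary HDU in a DataArray.
with fits.open(path) as hdul:
    # astype() copies the data out of the file and into native byte order.
    data = hdul[0].data.astype("float32")
    obj = hdul[0].header["OBJECT"]

# The NumPy array is ordered (NAXIS2, NAXIS1), i.e. (y, x).
da = xr.DataArray(data, dims=("y", "x"), attrs={"object": obj}, name="image")
```

A real tutorial would probably also turn WCS information into coordinates, which this sketch leaves out.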
@tomwhite mentioned that the sgkit file openers / converters are actually about to be deprecated in favour of a new package called bio2zarr. Basically their motivation is that the text-based VCF format etc. is so awfully designed that efficient access via a kerchunk-like approach is basically impossible, so they end up having to convert it to zarr anyway.
The VCF conversion code in sgkit and the new bio2zarr project both output the same Zarr format (specified here). The reason for bio2zarr is that users were struggling to get the Dask-based sgkit VCF conversion working reliably, so the code was rewritten as a command-line application that runs on multi-core local machines or HPC schedulers, and bio2zarr is the result.
There are a couple of example sgkit tutorials that may be of interest here: https://sgkit-dev.github.io/sgkit/latest/examples/index.html
Example: