JuliaNeuroscience / NeuroCore.jl

Core methods and structures for neuroscience research in Julia.
MIT License

Roadmap #4

Open Tokazama opened 4 years ago

Tokazama commented 4 years ago

General Approach

In the absence of a formal write-up (which I will upload shortly before JuliaCon), the strategy for development throughout this org is this: write the code we need to do neuroscience research, and do it well. Any code that ends up being useful to a wider community (e.g., plotting, stats, etc.) may eventually become part of a different ecosystem (e.g., JuliaPlots, JuliaStats, etc.).

From a pure user perspective, this means that if code moves from this org to another, it won't be noticeable. From a developer perspective, it means that code developed here might eventually end up somewhere else if that means it will be maintained by a more appropriately focused team of developers. Therefore, we use the widely used MIT license to make this sort of thing easy, and we play nice with our peers.

For example, at the time of writing I have a substantial PR providing methods for data types (time, space, observation, etc.). Instead of waiting for the entire Julia community to agree upon a standard for referring to time data, we can move the code to a broader community once it appears stable and useful. This means some packages may simply end up being short scripts that bind a variety of packages together in a useful way (see Makie.jl for an example) or just formal documentation on how to perform analyses, with a few convenience functions for learning (see StatisticalRethinking.jl).

The end result should look the same to users: a Julian approach to neuroinformatics.

Public Release

These are things that need to be taken care of before a wider public release of the package can happen.

File Format Support

I plan on getting the following supported (mostly because I regularly use these):

Here are some other formats that may be worth supporting but would require someone else to take the lead.

Type Interfaces

AbstractArray support

We need an AbstractArray subtype with the following features:

  1. Flexible metadata storage
  2. Named dimensions
  3. Indexing by units/keys

This is largely accomplished by AxisIndices.jl. I've been heavily testing it and hope to get it to a point where we can simply perform things like PCA and ICA on the types it provides.
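To make the three requirements above concrete, here is a minimal sketch in plain Base Julia. It is not the AxisIndices.jl API: the type `LabeledArray`, its fields, and the helper `select` are hypothetical names invented for illustration, and the lookup is a simple linear search rather than anything optimized.

```julia
# Hypothetical illustration only; none of these names come from AxisIndices.jl.
struct LabeledArray{T,N,A<:AbstractArray{T,N},L<:Tuple} <: AbstractArray{T,N}
    data::A                       # the wrapped array
    dimnames::NTuple{N,Symbol}    # feature 2: one name per dimension
    axiskeys::L                   # feature 3: one collection of keys per dimension
    metadata::Dict{Symbol,Any}    # feature 1: free-form metadata
end

# Convenience constructor that fills in the type parameters.
function LabeledArray(data::AbstractArray{T,N}, dimnames::NTuple{N,Symbol},
                      axiskeys::Tuple; metadata=Dict{Symbol,Any}()) where {T,N}
    return LabeledArray{T,N,typeof(data),typeof(axiskeys)}(data, dimnames, axiskeys, metadata)
end

# Minimal AbstractArray interface: size plus Cartesian getindex.
Base.size(x::LabeledArray) = size(x.data)
Base.getindex(x::LabeledArray{T,N}, i::Vararg{Int,N}) where {T,N} = x.data[i...]

# Index by dimension name and key; dimensions that are not mentioned keep all entries.
function select(x::LabeledArray; kwargs...)
    inds = map(x.dimnames, x.axiskeys) do name, ks
        haskey(kwargs, name) || return Colon()
        i = findfirst(==(kwargs[name]), ks)
        i === nothing && throw(KeyError(kwargs[name]))
        return i
    end
    return x.data[inds...]
end

# Made-up example data: 2 channels by 3 time points.
x = LabeledArray(reshape(1:6, 2, 3), (:channel, :time),
                 (["Cz", "Pz"], [0.0, 0.5, 1.0]);
                 metadata=Dict(:subject => "sub-01"))

single = select(x; channel="Pz", time=0.5)   # one element
row = select(x; channel="Cz")                # all time points for one channel
```

Because the wrapper subtypes `AbstractArray`, generic code that only relies on `size` and `getindex` should work on it unchanged, which is the property that matters for running things like PCA on such types.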

Geometry Types

I'm leaning towards using GeometryTypes.jl, which will provide support for plotting with very little effort.

Graph Types

Connectomes and potentially data access patterns.

Plotting and Visualization

The backend for this will likely be all or mostly Makie.jl.

Note that work for this specific objective will start rolling out after March 16th (after VizCon 2). This will ensure that developments in this area are in greater harmony with the future directions of Julia's various plotting ecosystems.

timholy commented 4 years ago

WRT an AxisArrays replacement, should we evaluate DimensionalData.jl? Both DimensionalData and NamedDims.jl happen to be standing at 227 commits, but the current effort being spent in DD is quite high.

Tokazama commented 4 years ago

should we evaluate DimensionalData.jl?

tl;dr: I'm open to whatever solution the wider community ends up adopting (particularly JuliaImages), but I currently think that NamedDims has some big advantages.

Although DimensionalData.jl has good performance, I think NamedDims.jl has had thought put into every single method's performance. For example, you'll find this comment peppered throughout the code: 0-Allocations see: `@btime (()->dim((:a, :b), :b))()`. Part of this may just be that more people seem to be contributing to NamedDims.jl in discussion and PRs.
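As a rough illustration of what such a name-to-position lookup involves, here is a sketch in Base Julia. It is not the NamedDims.jl implementation: `dimposition` and `_dimposition` are hypothetical names, and returning 0 for a missing name is an assumption made for this example. Recursing over the tuple (rather than looping over a vector) keeps everything in immutable, stack-friendly values, which is the usual way to give the compiler a chance to resolve the lookup without heap allocation.

```julia
# Hypothetical sketch; not the NamedDims.jl source.
# Returns the position of `name` in `names`, or 0 if it is absent (assumed convention).
dimposition(names::Tuple{Vararg{Symbol}}, name::Symbol) = _dimposition(names, name, 1)

# Base case: ran out of names without a match.
_dimposition(::Tuple{}, name::Symbol, i::Int) = 0

# Recursive case: compare the first name, then peel it off with Base.tail.
function _dimposition(names::Tuple, name::Symbol, i::Int)
    return first(names) === name ? i : _dimposition(Base.tail(names), name, i + 1)
end

pos_b = dimposition((:a, :b), :b)
pos_missing = dimposition((:a, :b), :c)

# Allocation can be measured with Base's @allocated after a warm-up call
# (or with @btime from BenchmarkTools.jl, as in the quoted comment).
lookup() = dimposition((:a, :b), :b)
lookup()                        # warm-up so compilation is not measured
bytes = @allocated lookup()     # inspect this value rather than assuming it
```

Whether the measured allocation is actually zero depends on the Julia version and on how the call is inlined, so it is worth checking rather than taking for granted.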

I have a lot of nitpicky things about DimensionalData's design that I don't like. Most of my issues with it stem from it requiring users to specify dedicated types for each dimension. I think this results in a lot of unfriendly syntax, and I suspect that it could lead to more burdensome maintenance in the future if it's widely adopted (if two different packages define the Time dimension type, then everything breaks when those two packages are loaded).
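The concern about two packages each defining a Time type can be sketched in plain Julia. The modules `PackageA` and `PackageB` and the function `describe` below are hypothetical stand-ins, not code from DimensionalData.jl or any real package; the point is only that identically named types from different modules are unrelated, while identical symbols are always the same value.

```julia
# Hypothetical stand-ins for two independent packages.
module PackageA
    struct Time end
    describe(::Time) = "time axis (PackageA)"
end

module PackageB
    struct Time end
end

# Same name, but two distinct types.
same_type = PackageA.Time === PackageB.Time

# A method written for PackageA.Time does not accept PackageB.Time.
handles_b = applicable(PackageA.describe, PackageB.Time())

# A symbol-based dimension name has no such ambiguity.
same_symbol = :time === :time
```

In this sketch `same_type` and `handles_b` are both `false`, while `same_symbol` is `true`. If both modules also exported `Time`, code that loads both with `using` would additionally have to qualify the name.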

I admit that a lot of these reasons aren't completely concrete (ugly syntax to one is beautiful Python syntax to another 😉). Therefore, if the consensus ends up being that DimensionalData is the way to go, then I will fully support that.

One last point. I think the problem with indexing could be solved relatively quickly once people agree on and get behind a single approach. I really like what @mcabbott and I are circling around at the end of https://github.com/JuliaCollections/AxisArraysFuture/issues/1. I have most of the code implemented for what I've proposed and I'm starting to write some more examples here.

timholy commented 4 years ago

Good to know. Until this morning I hadn't realized I wasn't watching NamedDims.jl, so I missed all those discussions. Maybe a good thing though, I've not been ready yet to tackle this issue with the seriousness it deserves, so best if others take the lead. But some of our other cleanups are getting sufficiently complete that this one is rising higher in the priority queue.

Tokazama commented 4 years ago

I just pushed some more examples and benchmarks to the last link I referenced. It's not pretty enough for a formal package, but I've tried to explore a lot of generic array interfaces as far as I can reasonably take them (e.g., how it would look to perform cat, append!, push!, etc.). The trouble I've found with a lot of this is that it's hard to explain why a certain approach might be better without a working demonstration, so hopefully this will prove helpful as people have time to look at it.

I've not been ready yet to tackle this issue with the seriousness it deserves

I appreciate any contributions you're able to make. I understand you are very busy with other things.

As for the rest of the roadmap (or lack thereof), I will finish the broad picture over the next couple of days and create separate issues to explore the more granular details.

Tokazama commented 4 years ago

Registering a package soon that should help with (or even solve) the array indexing issue here: https://github.com/Tokazama/AxisIndices.jl

behinger commented 4 years ago

Not sure where the best place to write this is, but I wrote a very basic EEGLAB .set/.fdt importer: https://github.com/unfoldtoolbox/unfold.jl/blob/master/test/debug_readEEGlab.jl

Needs to be adapted and extended to NeuroCore - I will do so happily once a bit more docu is there :)