bids-standard / pybids-light

A prospective light-weight tool for querying BIDS datasets using the specification schema. May be integrated into pybids in the future.
MIT License

Define scope of project #1

Open tsalo opened 3 years ago

tsalo commented 3 years ago

The primary goal of this project is to create a lightweight library for querying BIDS datasets, using the BIDS schema. While this may ultimately be integrated into pybids as a sort of lightweight-layout module, at the moment we feel it would be best to develop independently.

In order to make this tool as light as possible, we plan to rely on schema-generated regular expressions to find and parse files, without using SQLAlchemy or indexing the full dataset.
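
As a rough illustration of the approach, here is a minimal Python sketch of building a filename regex from a schema and using it to query files without an index. `TOY_SCHEMA` is a hypothetical, heavily simplified stand-in and not the real BIDS schema format. The function names `build_filename_regex`, `parse_filename`, and `query` are also assumptions made for this example, not an existing API.

```python
import re
from pathlib import Path

# Hypothetical toy "schema": entity keys in filename order, plus a few
# suffixes and extensions. The real BIDS schema is far richer than this.
TOY_SCHEMA = {
    "entities": ["sub", "ses", "task", "run"],
    "suffixes": ["bold", "T1w", "events"],
    "extensions": [".nii.gz", ".nii", ".json", ".tsv"],
}


def build_filename_regex(schema):
    """Compile a regex with one named group per entity, suffix and extension."""
    parts = []
    for i, key in enumerate(schema["entities"]):
        sep = "" if i == 0 else "_"
        group = f"{sep}{key}-(?P<{key}>[a-zA-Z0-9]+)"
        # In this sketch only "sub" is required; every other entity is optional.
        parts.append(group if key == "sub" else f"(?:{group})?")
    suffix = "|".join(re.escape(s) for s in schema["suffixes"])
    ext = "|".join(re.escape(e) for e in schema["extensions"])
    return re.compile("".join(parts) + f"_(?P<suffix>{suffix})(?P<extension>{ext})")


def parse_filename(name, pattern):
    """Return the entities found in a filename, or None if it does not match."""
    match = pattern.fullmatch(name)
    if match is None:
        return None
    return {k: v for k, v in match.groupdict().items() if v is not None}


def query(root, pattern, **filters):
    """Walk the tree and yield (path, entities) for files matching all filters.

    Nothing is stored in a database or index; each file is parsed on the fly.
    """
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        entities = parse_filename(path.name, pattern)
        if entities is not None and all(
            entities.get(key) == value for key, value in filters.items()
        ):
            yield path, entities
```

With this sketch, `query(dataset_root, build_filename_regex(TOY_SCHEMA), task="rest", suffix="bold")` would yield every matching file under `dataset_root` as it is encountered. The trade-off is that each query rescans the directory tree.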

Our current plan is discussed in this HackMD.

tsalo commented 3 years ago

@erdalkaraca drafted a proposal here: https://docs.google.com/presentation/d/12x3cQGRD9-T1bkpK--t0e_OdgMb3UuO4jn5fGpj0a4w

tsalo commented 3 years ago

To do list from today's call:

  1. Create new repository for new tool.
    • Work on pybids-compatible interface.
  2. Compile list of functions/methods we need to support with new tool in issue in new repository.
    • Metadata search methods.
    • Include deep parameter for Dataset/BIDSLayout?
  3. Comment on bids-validator repository about using Erdal's approach to validation based on the schema.
  4. Comment on pybids repo about (1) splitting up repository and (2) using new schema-based approach in BIDSLayout. Do this after benchmarking new tool against old one.
  5. Weigh in on schema structure and content.

Does that sound like everything?

EDIT: Tagging @erdalkaraca and Jochem Rieger (IDK your username)

erdalkaraca commented 3 years ago

Thanks, Taylor! I have created a new repo: https://github.com/ANCPLabOldenburg/ancp-bids. I will file issues against the new repo for the topics you listed above.

yarikoptic commented 2 years ago

The primary goal of this project is to create a lightweight library for querying BIDS datasets, using the BIDS schema. While this may ultimately be integrated into pybids as a sort of lightweight-layout module, at the moment we feel it would be best to develop independently.

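Echoing my https://github.com/physiopy/phys2bids/pull/374/files#r809277094 and the ongoing https://github.com/nipy/heudiconv/pull/544, I wonder if it would be worthwhile working toward a library not for "querying" per se, but for "manipulation" of bids things, such as

  • easy parsing and changing (adding/removing entities etc) of BIDS filenames so they remain bids compliant (https://github.com/nipy/heudiconv/pull/544)?
  • basic CLI/Python interfaces for manipulating files (eh, found elderly https://github.com/bids-standard/bidsutils/issues/6 by myself, more recent discussion: https://github.com/bids-standard/pybids/issues/774)
    • mv and rm to rename/move and delete a file while also taking care of
      • renaming/deleting the sidecar .json etc. files too
      • ensuring metadata associated with the file remains the same (which might not hold because of the inheritance principle, so more than the immediate sidecar needs to be considered)
      • adjusting _scans.tsv file records

I am not yet 100% sure it would be possible to avoid growing another "pybids", but the goal is to keep it lean/quick to install and use. OR should we really go back to pybids, and see how we could partition it (e.g. pybids-core + pybids; maybe just at pypi level) so we could keep improving pybids while being able to gain the "lightweight" version with only needed helpers without full "understand the BIDS dataset universe" bloat?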
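
To make the "manipulation" idea concrete, here is a minimal Python sketch of an mv-style helper that changes entities in a BIDS filename and renames the same-named .json sidecar along with the data file. All names here (`ENTITY_ORDER`, `KNOWN_EXTENSIONS`, `split_bids_name`, `build_bids_name`, `bids_mv`) are hypothetical and are not part of pybids or heudiconv. The entity order is a small hard-coded subset. The sketch does not handle the inheritance principle or `_scans.tsv` updates.

```python
from pathlib import Path

# Assumed subset of BIDS entities, in the order they must appear in a filename.
ENTITY_ORDER = ("sub", "ses", "task", "acq", "run")
# Extensions this sketch recognizes.
KNOWN_EXTENSIONS = (".nii.gz", ".nii", ".json", ".tsv")


def split_bids_name(name):
    """Split 'sub-01_task-rest_bold.nii.gz' into (entities, suffix, extension)."""
    ext = next((e for e in KNOWN_EXTENSIONS if name.endswith(e)), None)
    if ext is None:
        raise ValueError(f"unrecognized extension: {name}")
    *pairs, suffix = name[: -len(ext)].split("_")
    entities = dict(pair.split("-", 1) for pair in pairs)
    return entities, suffix, ext


def build_bids_name(entities, suffix, ext):
    """Reassemble a filename, putting entities back into ENTITY_ORDER."""
    # Raises ValueError for entities outside the assumed subset.
    keys = sorted(entities, key=ENTITY_ORDER.index)
    return "_".join([f"{k}-{entities[k]}" for k in keys] + [suffix]) + ext


def bids_mv(path, **changes):
    """Rename a file and its immediate .json sidecar after changing entities.

    A value of None removes the entity; any other value sets or adds it.
    """
    path = Path(path)
    entities, suffix, ext = split_bids_name(path.name)
    for key, value in changes.items():
        if value is None:
            entities.pop(key, None)
        else:
            entities[key] = value
    new_path = path.with_name(build_bids_name(entities, suffix, ext))
    path.rename(new_path)
    # Move the sidecar that shares the old stem, if there is one.
    sidecar = path.with_name(path.name[: -len(ext)] + ".json")
    if ext != ".json" and sidecar.exists():
        sidecar.rename(new_path.with_name(new_path.name[: -len(ext)] + ".json"))
    return new_path
```

For example, `bids_mv("sub-01_task-rest_bold.nii.gz", run="2")` would rename both the image and `sub-01_task-rest_bold.json`, inserting `run-2` after the `task` entity. A real tool would also need to check sidecars higher up the directory tree and rewrite the matching `_scans.tsv` row.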

poldrack commented 2 years ago

I think it would be really unfortunate if another pybids-like package were to be developed, as opposed to working together to improve/extend pybids.


tsalo commented 2 years ago

We've started moving the pybids.reports module into a namespace package (https://github.com/bids-standard/pybids-reports). Perhaps something similar could be done with this pybids-light idea. I.e., it could be implemented as a namespace package (e.g., pybids.ext.light) maybe?

yarikoptic commented 2 years ago

But isn't it kind of the other way around: pybids-reports extends pybids, i.e. it has (already heavy) pybids in its install_requires? It would not make sense to have a "light" version of pybids which requires heavy pybids... Or does it not really have to require pybids, and could just provide pybids.ext.light, while pybids could add it to install_requires if it started to use its functionality?

yarikoptic commented 2 years ago

I think it would be really unfortunate if another pybids-like package were to be developed, as opposed to working together to improve/extend pybids.

Totally agree! Starting yet another ball rolling, hopefully to avoid that, via https://github.com/bids-standard/pybids/issues/818 ;) But echoing @tsalo's comment on pybids-reports, figuring out modularization of pybids seems to be something in dire need.

erdalkaraca commented 2 years ago

With ancpBIDS we have a modular and extendable (using a plugins mechanism) implementation of a "pybids lite" without any "heavyweight dependencies". The library also provides a subset of PyBIDS' BIDSLayout interface. Documentation is work in progress, but you can already have a look at it:

https://ancpbids.readthedocs.io/en/latest/usage.html#query-using-the-pybids-api

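We proposed two topics for GSoC 2022 as part of INCF's participation as a mentoring organization:

  • interactive graph visualization of the BIDS Schema
  • validation engine to handle rules derivable from the BIDS schema and the BIDS Specification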

poldrack commented 2 years ago

given that pybids is probably up for a full refactor at some point, it would be worth it for the pybids team to have a closer look at the ancpBIDS project and see whether it could be a suitable base for pybids-core 2.0. my only goal here is to avoid fragmentation and the inevitable duplication of efforts that comes with it.


erdalkaraca commented 2 years ago

At least at a conceptual level, ancpBIDS has some interesting approaches to consider:


ancpJR commented 2 years ago

In close communication with @tsalo back in 2021, we decided to call the library ancpBIDS to prevent any overlap with PyBIDS and to get started quickly on exploring technical ideas and approaches. Still, we agreed that a migration path for PyBIDS users should be considered by providing the established BIDSLayout interface as an entry point. We would be happy to discuss the current implementation and whether it could be a suitable base for pybids-core 2. How should we proceed?
