hdmf-dev / hdmf

The Hierarchical Data Modeling Framework
http://hdmf.readthedocs.io
Other
46 stars 24 forks source link

[Feature]: Running NeuroConv tests on dev branches #1113

Open CodyCBakerPhD opened 1 month ago

CodyCBakerPhD commented 1 month ago

What would you like to see added to HDMF?

From https://github.com/hdmf-dev/hdmf/issues/1111#issue-2286515788

It might be advantageous to setup some dev testing of NeuroConv here on HDMF to ensure PRs don't have ripple effects throughout the ecosystem (for example, NWB Inspector tests against both dev PyNWB and downstream DANDI to ensure both upward and downward compatibility)

What solution would you like?

Here is the initial action design I would suggest (not tested, might need debugging)

In a file deploy_neuroconv_tests.yml...

name: Deploy NeuroConv tests

on:
  pull_request:
  workflow_dispatch:

# Cancel previous workflows on the same pull request
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:

  example_data_cache:
    uses: ./.github/workflows/example_data_cache.yml
    secrets:
      AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
      AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      S3_GIN_BUCKET: ${{ secrets.S3_GIN_BUCKET }}

  neuroconv_tests:
    needs: example_data_cache
    uses: ./.github/workflows/neuroconv_tests.yml

In example_data_cache.yml...

name: Example data cache
on:
  workflow_call:
    secrets:
      AWS_ACCESS_KEY_ID:
        required: true
      AWS_SECRET_ACCESS_KEY:
        required: true
      S3_GIN_BUCKET:
        required: true
  workflow_dispatch:

jobs:

    name: Example data cache
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.12"]
        os: [ubuntu-latest]

      - name: Get ephy_testing_data current head hash
        id: ephys
        run: echo "::set-output name=HASH_EPHY_DATASET::$(git ls-remote https://gin.g-node.org/NeuralEnsemble/ephy_testing_data.git HEAD | cut -f1)"
      - name: Get cached ephys example data - ${{ steps.ephys.outputs.HASH_EPHY_DATASET }}
        uses: actions/cache@v4
        id: cache-ephys-datasets
        with:
          path: ./ephy_testing_data
          key: ephys-datasets-${{ matrix.os }}-${{ steps.ephys.outputs.HASH_EPHY_DATASET }}

      - name: Get ophys_testing_data current head hash
        id: ophys
        run: echo "::set-output name=HASH_OPHYS_DATASET::$(git ls-remote https://gin.g-node.org/CatalystNeuro/ophys_testing_data.git HEAD | cut -f1)"
      - name:Get cached ophys example data - ${{ steps.ophys.outputs.HASH_OPHYS_DATASET }}
        uses: actions/cache@v4
        id: cache-ophys-datasets
        with:
          path: ./ophys_testing_data
          key: ophys-datasets-${{ matrix.os }}-${{ steps.ophys.outputs.HASH_OPHYS_DATASET }}

      - name: Get behavior_testing_data current head hash
        id: behavior
        run: echo "::set-output name=HASH_BEHAVIOR_DATASET::$(git ls-remote https://gin.g-node.org/CatalystNeuro/behavior_testing_data.git HEAD | cut -f1)"
      - name: Get cached behavior example data - ${{ steps.behavior.outputs.HASH_BEHAVIOR_DATASET }}
        uses: actions/cache@v4
        id: cache-behavior-datasets
        with:
          path: ./behavior_testing_data
          key: behavior-datasets-${{ matrix.os }}-${{ steps.behavior.outputs.HASH_behavior_DATASET }}

      - if: steps.cache-ephys-datasets.outputs.cache-hit != 'true' || steps.cache-ophys-datasets.outputs.cache-hit != 'true' || steps.cache-behavior-datasets.outputs.cache-hit != 'true'
        name: Install and configure AWS CLI
        run: |
          pip install awscli
          aws configure set aws_access_key_id ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws configure set aws_secret_access_key ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      - if: steps.cache-ephys-datasets.outputs.cache-hit != 'true'
        name: Download ephys dataset from S3
        run: aws s3 cp --recursive --region=us-east-2 ${{ secrets.S3_GIN_BUCKET }}/ephy_testing_data ./ephy_testing_data
      - if: steps.cache-ophys-datasets.outputs.cache-hit != 'true'
        name: Download ophys dataset from S3
        run: aws s3 cp --recursive --region=us-east-2 ${{ secrets.S3_GIN_BUCKET }}/ophys_testing_data ./ophys_testing_data
      - if: steps.cache-behavior-datasets.outputs.cache-hit != 'true'
        name: Download behavior dataset from S3
        run: aws s3 cp --recursive --region=us-east-2 ${{ secrets.S3_GIN_BUCKET }}/behavior_testing_data ./behavior_testing_data

In neuroconv_tests.yml...

name: NeuroConv Tests
on:
  workflow_call:
  workflow_dispatch:

jobs:

  run:
    name: Ubuntu - Python ${{ matrix.python-version }}
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.8", "3.12"]
        os: [ubuntu-latest]
    steps:
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Global Setup
        run: |
          python -m pip install -U pip  # Official recommended way
          pip install pytest-xdist
          git config --global user.email "CI@example.com"
          git config --global user.name "CI Almighty"
          pip install wheel # Needed for scan image

      - name: Install NeuroConv with full requirements
        run: pip install --no-cache-dir neuroconv[full,test]

      - uses: actions/checkout@v4
      - run: git fetch --prune --unshallow --tags
      - name: Install current branch of HDMF
        run: |
          pip uninstall hdmf
          pip install .

      - name: Get ephy_testing_data current head hash
        id: ephys
        run: echo "::set-output name=HASH_EPHY_DATASET::$(git ls-remote https://gin.g-node.org/NeuralEnsemble/ephy_testing_data.git HEAD | cut -f1)"
      - name: Get cached ephys example data - ${{ steps.ephys.outputs.HASH_EPHY_DATASET }}
        uses: actions/cache@v4
        id: cache-ephys-datasets
        with:
          path: ./ephy_testing_data
          key: ephys-datasets-${{ matrix.os }}-${{ steps.ephys.outputs.HASH_EPHY_DATASET }}

      - name: Get ophys_testing_data current head hash
        id: ophys
        run: echo "::set-output name=HASH_OPHYS_DATASET::$(git ls-remote https://gin.g-node.org/CatalystNeuro/ophys_testing_data.git HEAD | cut -f1)"
      - name:Get cached ophys example data - ${{ steps.ophys.outputs.HASH_OPHYS_DATASET }}
        uses: actions/cache@v4
        id: cache-ophys-datasets
        with:
          path: ./ophys_testing_data
          key: ophys-datasets-${{ matrix.os }}-${{ steps.ophys.outputs.HASH_OPHYS_DATASET }}

      - name: Get behavior_testing_data current head hash
        id: behavior
        run: echo "::set-output name=HASH_BEHAVIOR_DATASET::$(git ls-remote https://gin.g-node.org/CatalystNeuro/behavior_testing_data.git HEAD | cut -f1)"
      - name: Get cached behavior example data - ${{ steps.behavior.outputs.HASH_BEHAVIOR_DATASET }}
        uses: actions/cache@v4
        id: cache-behavior-datasets
        with:
          path: ./behavior_testing_data
          key: behavior-datasets-${{ matrix.os }}-${{ steps.behavior.outputs.HASH_behavior_DATASET }}

      - name: Checkout tools repo
        uses: actions/checkout@v4
        with:
          repository: catalystneuro/neuroconv
          path: neuroconv

      - name: Run NeuroConv tests
        run: pytest -vv -rsx -n auto --dist loadscope

Do you have any interest in helping implement the feature?

Yes.

CodyCBakerPhD commented 1 month ago

@mavaylon1 In advance of our meeting

mavaylon1 commented 1 month ago

Good stuff

mavaylon1 commented 1 month ago

We would need to have the values of the secrets you use as well as the matrix of configurations.