mbari-org / m3-quickstart

A simple and quick method to run MBARI's Video Annotation and References System

Notes on install at OSU #13

Open hohonuuli opened 5 months ago

hohonuuli commented 5 months ago

To get started I did the following:

m3-quickstart is installed at /Users/annotationstation1/workspace/m3-quickstart

hohonuuli commented 5 months ago

We're on wifi, so we don't really have a fixed hostname. I'm setting up annotationstation1 to use localhost. This is set in bin/docker-env.sh. I'll have to change that when they get the machine wired and set up a 2nd annotation station.

hohonuuli commented 5 months ago

On a fresh install, the vars-kb-server and the vars-user-server were throwing errors. I think this is due to a copy of MBARI's KB being loaded, which takes some time. I let it sit for a few minutes and restarted the services (docker_stop.sh && docker_start.sh) and they came up fine.

hohonuuli commented 5 months ago

Running vars_build.sh results in:

--- BUILDING VARS Knowledgebase Application
Base Directory:      /Users/annotationstation1/workspace/m3-quickstart/bin
HOST:                http://127.0.0.1
ANNOSAURUS REST URL: http://127.0.0.1:8082/v1
KB JDBC URL:         jdbc:postgresql://127.0.0.1:5432/M3_VARS?sslmode=disable&stringType=unspecified
Building vars-kb in /Users/annotationstation1/workspace/m3-quickstart/temp/repos/vars-kb
Cloning into 'vars-kb'...
remote: Enumerating objects: 2503, done.
remote: Counting objects: 100% (291/291), done.
remote: Compressing objects: 100% (184/184), done.
remote: Total 2503 (delta 118), reused 227 (delta 74), pack-reused 2212
Receiving objects: 100% (2503/2503), 3.72 MiB | 16.57 MiB/s, done.
Resolving deltas: 100% (1216/1216), done.
Downloading https://services.gradle.org/distributions/gradle-8.8-bin.zip
.............10%.............20%.............30%.............40%.............50%.............60%..............70%.............80%.............90%.............100%

Welcome to Gradle 8.8!

Here are the highlights of this release:
 - Running Gradle on Java 22
 - Configurable Gradle daemon JVM
 - Improved IDE performance for large projects

For more details see https://docs.gradle.org/8.8/release-notes.html

Starting a Gradle Daemon (subsequent builds will be faster)

> Configure project :
Project :org.mbari.kb.core => 'org.mbari.kb.core' Java module
Jdeps Gradle plugin 0.20.0. Consider becoming a patron at https://www.patreon.com/aalmiray
Project :org.mbari.kb.jpa => 'org.mbari.kb.jpa' Java module
Project :org.mbari.kb.shared => 'org.mbari.kb.shared' Java module
Project :org.mbari.kb.ui => 'org.mbari.kb.ui' Java module

> Task :org.mbari.kb.core:compileJava FAILED

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':org.mbari.kb.core:compileJava'.
> Could not resolve all files for configuration ':org.mbari.kb.core:compileClasspath'.
   > Could not resolve org.mbari:mbarix4j:2.0.5.jre11.
     Required by:
         project :org.mbari.kb.core
      > Could not resolve org.mbari:mbarix4j:2.0.5.jre11.
         > Could not get resource 'https://maven.pkg.github.com/mbari-org/maven/org/mbari/mbarix4j/2.0.5.jre11/mbarix4j-2.0.5.jre11.pom'.
            > Could not GET 'https://maven.pkg.github.com/mbari-org/maven/org/mbari/mbarix4j/2.0.5.jre11/mbarix4j-2.0.5.jre11.pom'. Received status code 403 from server: Forbidden
> There is 1 more failure with an identical cause.

* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
> Run with --scan to get full insights.
> Get more help at https://help.gradle.org.

Deprecated Gradle features were used in this build, making it incompatible with Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.

For more on this, please refer to https://docs.gradle.org/8.8/userguide/command_line_interface.html#sec:command_line_warnings in the Gradle documentation.

BUILD FAILED in 26s
1 actionable task: 1 executed
--- BUILDING VARS Query Application
Base Directory:      /Users/annotationstation1/workspace/m3-quickstart/bin
HOST:                http://127.0.0.1
ANNOSAURUS JDBC URL: jdbc:postgresql://127.0.0.1:5432/M3_VARS?sslmode=disable&stringType=unspecified
KB REST URL:         http://127.0.0.1:8083/kb/v1
Building vars-query in /Users/annotationstation1/workspace/m3-quickstart/temp/repos/vars-query
Cloning into 'vars-query'...
remote: Enumerating objects: 1492, done.
remote: Counting objects: 100% (152/152), done.
remote: Compressing objects: 100% (94/94), done.
remote: Total 1492 (delta 51), reused 111 (delta 32), pack-reused 1340
Receiving objects: 100% (1492/1492), 3.05 MiB | 11.84 MiB/s, done.
Resolving deltas: 100% (779/779), done.
Downloading https://services.gradle.org/distributions/gradle-8.6-bin.zip
............10%.............20%............30%.............40%.............50%............60%.............70%.............80%............90%.............100%

Welcome to Gradle 8.6!

Here are the highlights of this release:
 - Configurable encryption key for configuration cache
 - Build init improvements
 - Build authoring improvements

For more details see https://docs.gradle.org/8.6/release-notes.html

Starting a Gradle Daemon (subsequent builds will be faster)
> Task :compileJava FAILED

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':compileJava'.
> Could not resolve all files for configuration ':compileClasspath'.
   > Could not resolve com.guigarage:sdkfx:0.3.0.jre21.
     Required by:
         project :
      > Could not resolve com.guigarage:sdkfx:0.3.0.jre21.
         > Could not get resource 'https://maven.pkg.github.com/mbari-org/maven/com/guigarage/sdkfx/0.3.0.jre21/sdkfx-0.3.0.jre21.pom'.
            > Could not GET 'https://maven.pkg.github.com/mbari-org/maven/com/guigarage/sdkfx/0.3.0.jre21/sdkfx-0.3.0.jre21.pom'. Received status code 403 from server: Forbidden
   > Could not resolve org.bushe:eventbus:1.5.
     Required by:
         project :
      > Could not resolve org.bushe:eventbus:1.5.
         > Could not get resource 'https://maven.pkg.github.com/mbari-org/maven/org/bushe/eventbus/1.5/eventbus-1.5.pom'.
            > Could not GET 'https://maven.pkg.github.com/mbari-org/maven/org/bushe/eventbus/1.5/eventbus-1.5.pom'. Received status code 403 from server: Forbidden
   > Could not resolve org.mbari:mbarix4j:2.0.5.jre11.
     Required by:
         project :
      > Could not resolve org.mbari:mbarix4j:2.0.5.jre11.
         > Could not get resource 'https://maven.pkg.github.com/mbari-org/maven/org/mbari/mbarix4j/2.0.5.jre11/mbarix4j-2.0.5.jre11.pom'.
            > Could not GET 'https://maven.pkg.github.com/mbari-org/maven/org/mbari/mbarix4j/2.0.5.jre11/mbarix4j-2.0.5.jre11.pom'. Received status code 403 from server: Forbidden

* Try:
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
> Run with --scan to get full insights.
> Get more help at https://help.gradle.org.

Deprecated Gradle features were used in this build, making it incompatible with Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.

For more on this, please refer to https://docs.gradle.org/8.6/userguide/command_line_interface.html#sec:command_line_warnings in the Gradle documentation.

BUILD FAILED in 12s
1 actionable task: 1 executed
hohonuuli commented 5 months ago

I've been fighting a battle with GitHub personal access tokens. The default (classic) tokens are no longer able to read from MBARI's GitHub Packages. I had to switch to the fine-grained tokens (in beta). Those tokens can read from GitHub Packages, but I can't get them to write.

This broke the KB build, as it depends on an unreleased version of mbarix4j. The workaround was to copy most of the mbarix4j code into the project and remove that dependency; it now builds.

Note that the JVM has changed so much since vars-kb was written that the app is barely usable anymore. We need to roll out https://github.com/mbari-org/oni and a new web interface ASAP.

hohonuuli commented 5 months ago

Building vars-query is failing for the same reason as above: I can't publish updated artifacts to GitHub Packages. The build was also failing with Maven 3.9.7 (the latest), as noted at https://stackoverflow.com/questions/78542808/maven-project-fails-to-resolve-javafx-dependencies. Working on fixes.

hohonuuli commented 5 months ago

The workaround is to install sdkfx and eventbus locally.

hohonuuli commented 5 months ago

Note that the VARS KB is not deleting concepts. Need to investigate that.

hohonuuli commented 5 months ago

Installed vars-gridview, but it's built for SQL Server only, not Postgres. Issues I ran into during the install are mbari-org/vars-gridview#77, mbari-org/vars-gridview#78, and mbari-org/vars-gridview#79

hohonuuli commented 5 months ago

To register videos:

tl;dr

cd ~/workspace/m3-quickstart/bin
conda activate m3-quickstart

# ./vars_register_medias_on_web.sh <camera name> <deployment name> <url to directory listing> -e
./vars_register_medias_on_web.sh "PICA" "PICA 008" "http://annotationstation.ceoas.oregonstate.edu/media/PICA008/" -e

[!WARNING] The -e flag tells the script to extract the video's creation time from the video metadata. If omitted, the script will parse the timestamp from the filename. For the PICA cam, the filename has the wrong timestamp because:

  1. it's in local time instead of UTC
  2. files in a sequence all share the same timestamp, followed by a sequence number

So it's very important to remember to use the -e flag with the PICA cam.

The -e flag is not needed for the TopoCam.

[!NOTE] Make sure the URL ends with a / or you will get an exception.
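A rough sketch of the filename pitfall, for reference. The filename pattern is an assumption based on the sample CNV name later in this thread (DunkTest1_20240522_23_24_59), and the US Pacific timezone is an assumption (OSU is in Oregon); the -e flag sidesteps all of this by reading the creation time from the video metadata.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Hypothetical example: parsing a PICA-style name (<label>_%Y%m%d_%H_%M_%S)
# yields a naive local-time datetime, not UTC.
name = "DunkTest1_20240522_23_24_59"
stamp = name.split("_", 1)[1]  # "20240522_23_24_59"
local = datetime.strptime(stamp, "%Y%m%d_%H_%M_%S")

# Interpreting it as US Pacific (an assumption) and converting to UTC shifts
# the wall-clock time by 7 hours during PDT:
utc = local.replace(tzinfo=ZoneInfo("America/Los_Angeles")).astimezone(timezone.utc)
print(local.isoformat())  # 2024-05-22T23:24:59
print(utc.isoformat())    # 2024-05-23T06:24:59+00:00
```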

Details

  1. Open /Users/annotationstation1/workspace/m3-quickstart/temp/media in Finder
  2. Under that directory (media) create a directory for your video deployment. For example TopoCam/11.
  3. Copy your videos into the deployment directory you just created.
  4. Browse to http://annotationstation.ceoas.oregonstate.edu/media and navigate to the page listing all the videos you want to register.
  5. Run the ./vars_register_medias_on_web.sh command with the correct arguments.
    • Make sure the URL ends with /.
    • If registering PICA cam files, make sure to use the -e flag.
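The two gotchas above could be captured in a small pre-flight check. This is a hypothetical helper sketch, not part of m3-quickstart:

```python
def check_register_args(url: str, camera: str, extract_metadata: bool) -> None:
    """Hypothetical pre-flight check for vars_register_medias_on_web.sh arguments."""
    if not url.endswith("/"):
        raise ValueError("directory-listing URL must end with /")
    if camera.upper().startswith("PICA") and not extract_metadata:
        raise ValueError("PICA deployments need -e; filename timestamps are wrong")

# OK: PICA deployment with -e
check_register_args(
    "http://annotationstation.ceoas.oregonstate.edu/media/PICA008/",
    "PICA",
    extract_metadata=True,
)
```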
hohonuuli commented 5 months ago

Reading a CNV file in python

The PICA cam has a Seabird CTD attached. I'm including the sample file here. The best way to read it in Python is with the seabird package (pip install seabird). I ran a test with it:

>>> import seabird
>>> from seabird.cnv import fCNV
>>> profile = fCNV('/Users/annotationstation1/workspace/m3-quickstart/temp/media/PICA005/DunkTest1_20240522_23_24_59.cnv')
>>> profile.attributes
{'sbe_model': '19plus V2', 'seasave': 'V 7.26.7.121', 'instrument_type': 'CTD', 'nquan': '27', 'nvalues': '2607', 'start_time': 'May 22 2024 23:24:59 [System UTC, header]', 'bad_flag': '-9.990e-29', 'file_type': 'ascii', 'md5': 'ad739e8af38ad74e4797be5d22fb09af', 'datetime': datetime.datetime(2024, 5, 22, 23, 24, 59), 'filename': 'DunkTest1_20240522_23_24_59.cnv'}
>>> profile.keys()
['altM', 'CNDC', 'c0mS/cm', 'c0uS/cm', 'density', 'DEPTH', 'descentrate', 'flECO-AFL', 'oxygenvoltage', 'oxygen_ml_L', 'prdM', 'potemperature', 'potemp068C', 'tv290C', 'tv268C', 'turbWETntu0', 'timeS', 'timeM', 'timeH', 'timeJ', 'v0', 'v1', 'v2', 'v3', 'v4', 'v5', 'flag']
>>> profile['DEPTH']
masked_array(data=[-0.077, -0.12 , -0.12 , ..., -0.119, -0.14 , -0.13 ],
             mask=False,
       fill_value=-9.99e-29)
>>> profile['potemperature']
masked_array(data=[14.3776, 14.3775, 14.3774, ..., 14.3658, 14.3658,
                   14.366 ],
             mask=False,
       fill_value=-9.99e-29)

DunkTest1_20240522_23_24_59.cnv.zip
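For what it's worth, the seabird package returns numpy masked arrays, with bad samples flagged by the CNV bad_flag (-9.990e-29 in the attributes above). A minimal sketch of cleaning those up before further processing:

```python
import numpy as np

# Values equal to the CNV bad_flag become masked; filled(np.nan) converts
# masked entries to NaN so downstream code (e.g. pandas) can drop them easily.
BAD_FLAG = -9.99e-29
depth = np.ma.masked_values([-0.077, BAD_FLAG, -0.12], BAD_FLAG)
clean = depth.filled(np.nan)
```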

[!NOTE] I'll write a transform later for Astrid when she has a file with GPS included after a ship deployment.

hohonuuli commented 5 months ago

Added Astrid (@abruptbathylab) with read access to https://github.com/mbari-org/m3-download so she can pull changes in the future. Note that m3-download uses pymssql to execute the following query, so it doesn't work with postgres:

SELECT
    a.uuid AS association_uuid, 
    o.uuid AS observation_uuid, 
    ir.uuid AS image_reference_uuid, 
    o.concept, 
    a.to_concept, 
    a.link_value, 
    ir.url, 
    o.activity, 
    o.observation_group,
    im.recorded_timestamp,
    im.elapsed_time_millis,
    im.timecode,
    vs.name,
    v.start_time,
    vr.width AS video_reference_width,
    vr.height AS video_reference_height
FROM
    M3_ANNOTATIONS.dbo.associations a INNER JOIN
    M3_ANNOTATIONS.dbo.observations o ON a.observation_uuid = o.uuid INNER JOIN 
    M3_ANNOTATIONS.dbo.imaged_moments im ON o.imaged_moment_uuid = im.uuid INNER JOIN
    M3_VIDEO_ASSETS.dbo.video_references vr ON im.video_reference_uuid = vr.uuid INNER JOIN
    M3_VIDEO_ASSETS.dbo.videos v ON vr.video_uuid = v.uuid INNER JOIN
    M3_VIDEO_ASSETS.dbo.video_sequences vs ON v.video_sequence_uuid = vs.uuid LEFT JOIN 
    M3_ANNOTATIONS.dbo.image_references ir ON JSON_VALUE(a.link_value, '$.image_reference_uuid') = ir.uuid
WHERE
    a.link_name = 'bounding box'

EDIT: I submitted mbari-org/m3-download#2 to investigate removing the SQL dependency so Astrid can download training sets from her data.
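As a starting point for that, a PostgreSQL translation of the query might look like the following. This is an untested sketch: it assumes all tables live in the single M3_VARS database (so the SQL Server three-part names drop away), and it replaces JSON_VALUE with a jsonb extraction; if link_value is not always valid JSON, the cast may need guarding.

```python
# Untested sketch of a PostgreSQL version of the bounding-box query above.
# Assumed changes: three-part names (M3_ANNOTATIONS.dbo.x) -> plain table
# names, and JSON_VALUE(x, '$.k') -> (x::jsonb ->> 'k') with a uuid cast.
PG_BOUNDING_BOX_QUERY = """
SELECT
    a.uuid AS association_uuid,
    o.uuid AS observation_uuid,
    ir.uuid AS image_reference_uuid,
    o.concept,
    a.to_concept,
    a.link_value,
    ir.url,
    o.activity,
    o.observation_group,
    im.recorded_timestamp,
    im.elapsed_time_millis,
    im.timecode,
    vs.name,
    v.start_time,
    vr.width  AS video_reference_width,
    vr.height AS video_reference_height
FROM
    associations a
    INNER JOIN observations o      ON a.observation_uuid = o.uuid
    INNER JOIN imaged_moments im   ON o.imaged_moment_uuid = im.uuid
    INNER JOIN video_references vr ON im.video_reference_uuid = vr.uuid
    INNER JOIN videos v            ON vr.video_uuid = v.uuid
    INNER JOIN video_sequences vs  ON v.video_sequence_uuid = vs.uuid
    LEFT JOIN image_references ir
        ON (a.link_value::jsonb ->> 'image_reference_uuid')::uuid = ir.uuid
WHERE
    a.link_name = 'bounding box'
"""
```

With psycopg2 or similar this should run against the quickstart's M3_VARS database, though schema qualification may still need adjusting depending on how the Postgres instance is provisioned.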

hohonuuli commented 5 months ago

Added pythia support to Astrid's docker compose file. I converted the FathomNet megalodon detector to TorchScript, renamed it best.torchscript, and created a best.names file with a single line, object. I put both in docker/pythia and added the following to the Docker Compose file:

  pythia:
    image: mbari/pythia
    restart: always
    ports:
      - "9999:8080"
    environment:
      - YOLO_VERSION=8
      - YOLOV5_RESOLUTION=1280
      # - YOLOV5_RESOLUTION=640
    volumes:
      - ${BASE_DIR}/docker/pythia:/opt/models

    networks:
      - m3
    command: run /opt/models/mbari_astrid_osu_yolov8_1280_2024-06-12.torchscript /opt/models/mbari_astrid_osu_yolov8_1280_2024-06-12.names
    # command: run /opt/models/fathomnet_megalodon.torchscript /opt/models/fathomnet_megalodon.names
hohonuuli commented 5 months ago

Associations are not shown in alphabetical order in the add-association combo box or when adding a quick button. Fix this in VARS.

hohonuuli commented 5 months ago

Astrid was testing annotating against MBARI's db across the VPN. Annotation works just fine, but Sharktopoda does not connect to VARS when the VPN is up. This means Sharktopoda still works as a video player, but you can't localize with it, since it can't tell VARS about new localizations.

hohonuuli commented 5 months ago

Changing annotation systems

On the VARS Annotation toolbar, select the gear button (settings) and then in the dialog that pops up, the Configuration Server button.

MBARI

| Field | Value |
| --- | --- |
| URL | http://m3.shore.mbari.org/config |
| VARS Username | astridl |
| VARS Password | your VARS password |

OSU

| Field | Value |
| --- | --- |
| URL | http://annotationstation.ceoas.oregonstate.edu/config |
| VARS Username | admin |
| VARS Password | the admin password |
hohonuuli commented 5 months ago

Merging seabird data

I added a script bin/osu_merge_seabird.sh that calls bin/etc/osu_merge_seabird.py. Usage is:

usage: osu_merge_seabird.py [-h] cnv_file video_sequence_name year

positional arguments:
  cnv_file             The cnv file to convert to csv
  video_sequence_name  Video Sequence Name is the deployment ID or expedition ID of the video. e.g.
                       'Doc Ricketts 1234'
  year                 The year to use for the timeJ conversion

options:
  -h, --help           show this help message and exit

It's really a starting point, as it will need to be modified when you start capturing position. Maybe Savana can update it? Its __parse method will need to be modified when lat and lon are available. Currently it's:

import datetime

from seabird.cnv import fCNV


def __parse(cnv_file: str, year: int):
    profile = fCNV(cnv_file)
    n = len(profile['timeJ'])
    for i in range(n):
        timeJ = profile['timeJ'][i]
        altitude = profile["altM"][i]
        depth_meters = profile["DEPTH"][i]
        temperature = profile["potemperature"][i]
        oxygen = profile["oxygen_ml_L"][i]

        # https://info.seabird.com/2026_SeaBird_c-mult_c-June-Newsletter_landing-Page-2.html
        dt = datetime.datetime(year, 1, 1) + datetime.timedelta(days=(timeJ - 1))
        date = dt.strftime("%Y-%m-%dT%H:%M:%S.%fZ")

        yield {
          #   "latitude": latitude,
          #   "longitude": longitude,
            "depth_meters": depth_meters,
            "temperature_celsius": temperature,
            "oxygen_ml_l": oxygen,
          #   "salinity": salinity,
            "recorded_timestamp": date,
            "altitude": altitude,
        }
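For reference, the timeJ conversion in __parse treats timeJ as a fractional day-of-year where January 1 is day 1. A quick check of that arithmetic in isolation:

```python
import datetime

def timej_to_datetime(timej: float, year: int) -> datetime.datetime:
    # Seabird's timeJ is a fractional Julian day-of-year; day 1 begins at
    # midnight on January 1, hence the (timej - 1) offset.
    return datetime.datetime(year, 1, 1) + datetime.timedelta(days=timej - 1)

print(timej_to_datetime(1.0, 2024))    # 2024-01-01 00:00:00
print(timej_to_datetime(143.5, 2024))  # 2024-05-22 12:00:00 (2024 is a leap year)
```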
hohonuuli commented 5 months ago

Notes from @lonnylundsten and @kwalz on the model they trained for Astrid:


From Kris:

Hi Brian and Astrid,
Attached is the mapping we did to Astrid's complexes with our concepts (third column was a number Astrid had on her complexes so I left here), a few have no localizations like Nanomia 2 and krill ind. Let us know if you have any questions on these.
kris

From Lonny, the final counts of localizations per complex (note: eggs was associated with Teuthoidea eggcase; we left it in because it was too hard to tease those apart and the numbers were low):

These are the final counts:

Aegina spp, 2813
Appendicularian, 2320
Beroe_cmplx, 3931
Chaetognatha, 4066
Cydippid_cmplx, 3277
Earleria, 716
Euphausiacea, 5477
Eusergestes similis, 3777
Halicreatidae, 1058
Hastigerinella digitata, 936
Krill molt, 866
Lobata, 8963
Medusae_unID, 412
Merluccius productus, 7213
Mycto_Leuro cmplx, 1848
Mysida, 2484
Nanomia bijuga, 9417
Pasiphaea pacifica, 848
Physonectae, 1606
Poeobius meseres, 3113
Prayid_rockets, 3020
Pyrosoma, 4577
Salp, 2650
Sebastes, 1839
Siphonophorae, 691
Solmissus, 5308
Teuthoidea, 6865
eggs, 59

OSU_conceptmapping.xlsx


From Lonny:

Model training just finished. It took 25 hours. Where would you like me to put the model and associated data Brian? I can put it in M3_ML on titan if that’s easy enough for you to grab?

50 epochs completed in 25.301 hours.
Optimizer stripped from runs/detect/train24/weights/last.pt, 136.9MB
Optimizer stripped from runs/detect/train24/weights/best.pt, 136.9MB

Validating runs/detect/train24/weights/best.pt...
Ultralytics YOLOv8.2.31 πŸš€ Python-3.10.14 torch-2.3.1 CUDA:0 (NVIDIA L4, 22478MiB)
                                                      CUDA:1 (NVIDIA L4, 22478MiB)
                                                      CUDA:2 (NVIDIA L4, 22478MiB)
                                                      CUDA:3 (NVIDIA L4, 22478MiB)
                                                      CUDA:4 (NVIDIA L4, 22478MiB)
                                                      CUDA:5 (NVIDIA L4, 22478MiB)
                                                      CUDA:6 (NVIDIA L4, 22478MiB)
                                                      CUDA:7 (NVIDIA L4, 22478MiB)
Model summary (fused): 268 layers, 68150532 parameters, 0 gradients, 257.5 GFLOPs
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 892/892 [0
                   all       7131       9175      0.657      0.696      0.719      0.485
            Teuthoidea        509        775      0.859      0.848      0.913      0.685
    Pasiphaea pacifica         85         87      0.762      0.782      0.844      0.543
      Poeobius meseres        300        313      0.576      0.553      0.638       0.36
                Lobata        905        907      0.804      0.931      0.945      0.804
              Pyrosoma        254        489       0.73      0.675      0.723      0.428
        Nanomia bijuga        776        985      0.571      0.709      0.672      0.419
        Cydippid_cmplx        328        332      0.753      0.801       0.83      0.569
        Prayid_rockets        257        301      0.548      0.648      0.619      0.393
           Beroe_cmplx        368        371      0.765      0.798      0.858      0.645
       Appendicularian        249        253      0.696      0.787      0.844      0.589
                Mysida        221        221      0.735      0.842      0.826      0.467
            Aegina spp        254        267       0.66      0.831      0.795      0.522
              Earleria         70         70      0.707      0.828      0.865      0.546
                  Salp        262        272      0.568      0.721        0.7      0.487
   Eusergestes similis        372        389      0.594       0.63      0.636      0.342
          Medusae_unID         34         34      0.456      0.173      0.229      0.189
             Solmissus        515        533      0.896       0.91      0.952      0.768
              Sebastes        126        169      0.653      0.828       0.77      0.578
          Chaetognatha        390        401      0.536      0.589      0.633      0.348
            Krill molt         78         85      0.412      0.412      0.381      0.165
         Halicreatidae        114        115      0.829      0.887      0.917      0.694
           Physonectae        105        143      0.773      0.804      0.825      0.551
          Euphausiacea        340        584      0.545      0.355      0.453      0.225
Hastigerinella digitata         85         87      0.576      0.621      0.666      0.349
     Mycto_Leuro cmplx        152        171      0.556      0.462      0.541       0.35
  Merluccius productus        165        742      0.573      0.686      0.683      0.418
         Siphonophorae         72         72      0.381      0.375      0.368      0.242
                  eggs          7          7      0.887          1      0.995      0.899
Speed: 0.4ms preprocess, 32.6ms inference, 0.0ms loss, 0.7ms postprocess per image

The model and associated training stuff is available here:
smb://titan.shore.mbari.org/M3_ML/2024/mbari_astrid_osu_yolov8_1280_2024-06-12/

confusion_matrix_normalized (attached image)


Model conversion

For pythia, the model needs to be converted to TorchScript, and a names file is needed.

conda activate ultralytics
yolo export model=mbari_astrid_osu_yolov8_1280_2024-06-12.pt format=torchscript imgsz=1280 
hohonuuli commented 5 months ago

Attaching the diff for OSU's install:

osu.patch