Signal-K / sytizen

Citizen Science (Sci-tizen) visualisation in the Unity.com engine
http://ar.skinetics.tech/stellarios/compass/#vis
MIT License

🤖🐚 ↝ [GP-43]: Initialise cloud data images, algo to populate them into anomalies #46

Open Gizmotronn opened 5 months ago

Gizmotronn commented 5 months ago

Each cloud datapoint should point to an anomaly of anomalytype == planet. Right now we'll just pull in images, randomly assign them, then upload them to the relevant bucket and update the pointer of the cloud datapoint in the anomalies table.

Alternatively, we could structure it so that we pull images into the cloud part of the storage bucket/dir of the setting anomaly; the classification then points to the storage element, so we don't need another field in anomalies.

So we can do some exploring. There won't be hugely complex algorithms; we'll maybe add an additional field to the cloud classification process later, like "select a planet you know that looks like this cloud could be from" (in which case we'd need a field in the anomalies table).

Executive decision: to achieve consistency, there will be a field in anomalies; however, cloud anomalies will have a parentItem pointing to a particular setting/planet anomaly row.
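Roughly, the initial pass could look something like this. A sketch only: it uses the supabase-py client and assumes a "clouds" storage bucket, a local cloud_images folder of placeholder images, and parentItem/avatar_url as the pointer columns; none of those names are final.

import os
import random
from supabase import create_client

# Assumed environment variables; adjust to the project's actual keys.
supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])

# Local directory of placeholder cloud images to assign randomly (hypothetical).
LOCAL_IMAGES = [os.path.join("cloud_images", f) for f in os.listdir("cloud_images")]

# Every cloud datapoint should point to an anomaly of anomalytype == planet.
planets = supabase.table("anomalies").select("id").eq("anomalytype", "planet").execute().data
clouds = supabase.table("anomalies").select("id").eq("anomalytype", "cloud").execute().data

for cloud in clouds:
    parent = random.choice(planets)
    local_image = random.choice(LOCAL_IMAGES)
    storage_path = f"clouds/{cloud['id']}.png"  # hypothetical bucket/path layout

    # Upload the randomly chosen image to the relevant storage bucket.
    with open(local_image, "rb") as f:
        supabase.storage.from_("clouds").upload(storage_path, f.read())

    # Update the pointers on the cloud row: parentItem -> the setting/planet anomaly,
    # avatar_url -> the uploaded image (both column names are assumptions, not final).
    supabase.table("anomalies").update({
        "parentItem": parent["id"],
        "avatar_url": storage_path,
    }).eq("id", cloud["id"]).execute()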

Gizmotronn commented 5 months ago

Great idea -> the method for verifying/assigning users other classifications is that they have to classify all of the data on their planet (or they have the option to... it unlocks a "completion status" concept. Once we have more data for each anomaly [type] we can reduce the threshold to, say, 65%). This means we then have a pipeline for having users classify things that have already been classified [by other users] to increase validity likelihood.

...Now we just need to link this in with the feed, somehow
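A minimal sketch of that gating check, in pure Python: the classified/total counts would come from the classifications and anomalies tables, and the 65% figure is just the reduced threshold floated above, not a decided value.

# Threshold for unlocking "verify other users' classifications".
# Starts effectively at 100% (classify everything on your planet) and can be
# lowered to e.g. 0.65 once each anomaly type has more data.
COMPLETION_THRESHOLD = 0.65

def completion_status(classified_count: int, total_count: int) -> float:
    """Fraction of a planet's anomalies this user has classified."""
    return classified_count / total_count if total_count else 0.0

def can_verify_others(classified_count: int, total_count: int) -> bool:
    """True once the user passes the threshold and can be fed
    already-classified data to increase validity likelihood."""
    return completion_status(classified_count, total_count) >= COMPLETION_THRESHOLD

# Example: 7 of 10 anomalies classified on the user's planet -> unlocked.
print(can_verify_others(7, 10))  # True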

Gizmotronn commented 5 months ago

An update:

[image]

Gizmotronn commented 5 months ago

The main logic for the cloud data already handles automatic filling of data, and I briefly (for about 15 seconds) thought about writing a Python snippet anyway so that we can pull the images from the anomalies.configuration column into the storage buckets, but I realised there's no point - it just adds extra storage, and I don't think the lag issues are as pronounced as they seem to be in the dev environment (in terms of having the images show up in the Camera Module). So we should be fine.

I'm going to move on to the surveyor module and come up with an automated response for each anomaly...

Gizmotronn commented 5 months ago

So far what I'm thinking for the surveyor module:

create table
  public.anomalies (
    id bigint generated by default as identity,
    content text null,
    "ticId" text null,
    anomalytype text null,
    type text null,
    radius double precision null,
    mass double precision null,
    density double precision null,
    gravity double precision null,
    "temperatureEq" double precision null,
    temperature double precision null,
    smaxis double precision null,
    orbital_period double precision null,
    classification_status text null,
    avatar_url text null,
    created_at timestamp with time zone not null default now(),
    deepnote text null,
    lightkurve text null,
    configuration jsonb null,
    constraint baseplanets_pkey primary key (id)
  ) tablespace pg_default;

Here's an example of the content for the configuration column:

{
  "mass": 2.14,
  "ticId": "Kepler-69",
  "radius": 1.71,
  "smaxis": 0.64,
  "ticId2": "KOI 172.02",
  "density": 2.36,
  "gravity": 0.73,
  "lightkurve": "https://qwbufbmxkjfaikoloudl.supabase.co/storage/v1/object/public/planetsss/_1710155300825",
  "temperature": 325.15,
  "orbital_period": 242.47,
  "temperature_eq": 548.15
}

Each anomaly represents a planet, which obviously has data like a temperature of 325.15 Kelvin, the orbital period, etc.
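Those physical fields are what users will fill in or refine; as a quick sketch, spotting which ones are still blank in a configuration blob could look like this (the field list is lifted from the example above and is only illustrative):

# Physical fields we expect a planet's configuration to eventually contain
# (taken from the example configuration above; adjust as the schema evolves).
EXPECTED_FIELDS = [
    "mass", "radius", "density", "gravity",
    "temperature", "temperature_eq", "smaxis", "orbital_period",
]

def missing_fields(configuration: dict) -> list[str]:
    """Return the physical fields a user could still classify for this anomaly."""
    return [f for f in EXPECTED_FIELDS if configuration.get(f) is None]

# With the Kepler-69 example above, nothing is missing:
example = {"mass": 2.14, "radius": 1.71, "smaxis": 0.64, "density": 2.36,
           "gravity": 0.73, "temperature": 325.15, "orbital_period": 242.47,
           "temperature_eq": 548.15}
print(missing_fields(example))  # []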

Users can make classifications for any missing values or update existing ones. Here's the schema for the classifications table:

create table
  public.classifications (
    id bigint generated by default as identity,
    created_at timestamp with time zone not null default now(),
    content text null,
    author uuid null,
    anomaly bigint null,
    media json null,
    constraint classifications_pkey primary key (id),
    constraint classifications_anomaly_fkey foreign key (anomaly) references anomalies (id),
    constraint classifications_author_fkey foreign key (author) references profiles (id)
  ) tablespace pg_default;

Next, a Python snippet/script that will pull in all classifications where classificationtype matches one of the fields in the anomaly's configuration element (in this case that would be mass, temperature, orbital_period, gravity, etc.). So we should extract the "columns" from the configuration field, then look through the classifications table for any rows where classificationtype equals one of those fields. Then it should group all those classifications by anomaly & classificationtype and average them out (e.g. if there are two classifications mentioning "gravity" for anomaly == 1, it should average the values).

I'm thinking about having some sort of way to "verify" changes, or adding some extra values to the anomalies.configuration value, like "userSuggestions" or something. But for now, just averaging them out and seeing the results locally would be a good start.
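As a first pass, something along these lines could do the pull-group-average step. A sketch only: it uses supabase-py and assumes classifications carry a classificationtype column and a numeric value in content, neither of which exists in the schema above yet.

import os
from collections import defaultdict
from supabase import create_client

supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])

# 1. Extract the "columns" from each anomaly's configuration field.
anomalies = supabase.table("anomalies").select("id, configuration").execute().data
config_fields = {a["id"]: set((a["configuration"] or {}).keys()) for a in anomalies}

# 2. Pull classifications whose classificationtype matches one of those fields.
#    (classificationtype and a numeric value are assumed here; the current schema
#    would need a column for each, or they could live inside content/media.)
classifications = supabase.table("classifications").select(
    "anomaly, classificationtype, content"
).execute().data

# 3. Group by (anomaly, classificationtype) and average the values.
grouped = defaultdict(list)
for c in classifications:
    if c["classificationtype"] in config_fields.get(c["anomaly"], set()):
        try:
            grouped[(c["anomaly"], c["classificationtype"])].append(float(c["content"]))
        except (TypeError, ValueError):
            continue  # skip classifications whose content isn't numeric

averages = {key: sum(vals) / len(vals) for key, vals in grouped.items()}

# e.g. two classifications mentioning "gravity" for anomaly == 1 get averaged:
for (anomaly_id, field), value in averages.items():
    print(f"anomaly {anomaly_id}: suggested {field} = {value:.2f}")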

Gizmotronn commented 5 months ago

[image]

Looks like a bit more debugging is required...

Gizmotronn commented 5 months ago

43

Gizmotronn commented 4 months ago

Started to work on this: https://deepnote.com/workspace/star-sailors-49d2efda-376f-4329-9618-7f871ba16007/project/Step-by-step-50ad3984-69a9-496e-a121-efb59231e7e9/notebook/Gen%20AI%20Stuff-4c10e2cf5db441c2b1cfd69079cc85f4

more info coming soon