@mmaelicke we will have 6 TB of orthophotos of California. How should we integrate this?
FYI: current stats
I think we need to discuss this in a short meeting next week. Not sure yet...
Alright :+1: Please send a quick message when you're ready for the meeting.
The `build_cog` route is in a new branch. This is a first, super ugly draft of the upload script, only for one GeoTIFF and label file:
```python
import json

import geopandas as gpd
import requests
from pydantic_geojson import MultiPolygonModel, PolygonModel
from supabase import create_client

BASE_URL = "http://0.0.0.0:8762"
GEOTIFF_FILE = "/Users/januschvajna-jehle/data/deadwood-example-data/orthos/uavforsat_2017_CFB044_ortho.tif"
LABELS_FILE = "/Users/januschvajna-jehle/data/deadwood-example-data/labels_aoi/uavforsat_2017_CFB044_ortho_polygons.gpkg"
# LABELS_FILE = 'uavforsat_2017_CFB044_labels.geojson'

SUPABASE_KEY = ""
SUPABASE_URL = ""
USER = "jesjehle@gmx.de"
PASSWORD = ""

# sign in and fetch a fresh access token
client = create_client(SUPABASE_URL, SUPABASE_KEY)
client.auth.sign_in_with_password({"email": USER, "password": PASSWORD})
auth_response = client.auth.refresh_session()
session = auth_response.session
access_token = session.access_token
user_id = session.user.id

# upload the GeoTIFF
with open(GEOTIFF_FILE, "rb") as f:
    upload_res = requests.post(
        BASE_URL + "/datasets",
        files={"file": f},
        headers={"Authorization": f"Bearer {access_token}"},
    )
upload_res_json = upload_res.json()
dataset_id = upload_res_json["id"]
name = upload_res_json["file_name"]
# sample output:
# {'id': 248,
#  'file_name': '1ba98c89-6b76-4402-bd7c-4680cb0a0c8b_uavforsat_2017_CFB044_ortho.tif',
#  'file_alias': 'uavforsat_2017_CFB044_ortho.tif',
#  'file_size': 1036120927, 'copy_time': 13.869181871414185,
#  'sha256': '4f8ca9a808442eae8f0a34d53a0ffcfa0f5698e4b5fed90fdd0de067529d4f82',
#  'bbox': 'BOX(8.116694192013465 48.17413731568594, 8.11973164880153 48.17625264529826)',
#  'status': 'pending', 'user_id': '6afa4242-681e-4611-a659-3287d06f6e49',
#  'created_at': '2024-08-14T13:56:19.065920+00:00'}
# dataset_id = 248
# name = "1ba98c89-6b76-4402-bd7c-4680cb0a0c8b_uavforsat_2017_CFB044_ortho.tif"

# generate metadata
metadata_res = requests.put(
    BASE_URL + f"/datasets/{dataset_id}/metadata",
    json={
        "dataset_id": dataset_id,
        "user_id": user_id,
        "name": name,
        "platform": "drone",
        "authors": "string",
        "license": "cc-by",
        "aquisition_year": 2017,
        "aquisition_month": None,
        "aquisition_day": None,
    },
    headers={"Authorization": f"Bearer {access_token}"},
)

# build COG
build_cog_res = requests.put(
    BASE_URL + f"/datasets/{dataset_id}/force-cog-build",
    json={
        # "overviews": 8,
        # "resolution": 0.04,
        # "profile": "jpeg",
        # "quality": 75,
        # "force_recreate": False,
    },
    headers={"Authorization": f"Bearer {access_token}"},
)

# build thumbnail
build_thumbnail_res = requests.put(
    BASE_URL + f"/datasets/{dataset_id}/build-thumbnail",
    json={
        # "force_recreate": False,
    },
    headers={"Authorization": f"Bearer {access_token}"},
)
# print(build_thumbnail_res)

# read AOI and deadwood labels from the GeoPackage
aoi = gpd.read_file(LABELS_FILE, layer="aoi").to_json()
label = gpd.read_file(LABELS_FILE, layer="standing_deadwood").to_json()
aoi_json = json.loads(aoi)
labels_json = json.loads(label)
print("labels:", labels_json)

aoi_model = PolygonModel(
    type="Polygon", coordinates=aoi_json["features"][0]["geometry"]["coordinates"]
)
# note: this wraps the AOI polygon as a MultiPolygon for testing
label_model = MultiPolygonModel(
    type="MultiPolygon",
    coordinates=[aoi_json["features"][0]["geometry"]["coordinates"]],
)

# upload labels
res_labels = requests.put(
    BASE_URL + f"/datasets/{dataset_id}/labels",
    json={
        # "aoi": aoi_json["features"][0]["geometry"],
        "aoi": aoi_model.model_dump_json(),
        # "label": labels_json["features"][0]["geometry"],
        "label": label_model.model_dump_json(),
        "label_source": "visual_interpretation",
        "label_quality": 0,
        "label_type": "point_observation",
    },
    headers={"Authorization": f"Bearer {access_token}"},
)
```
@mmaelicke We have problems satisfying the pydantic multipolygon model. Any suggestions?
Can't have a look right now. Have a look at migrate.py, which does the same thing. With MultiPolygons you need to get the number of nested brackets right. Maybe that's the problem.
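For reference, the bracket nesting GeoJSON expects (a minimal sketch with made-up coordinates):

```python
# Polygon coordinates: rings -> [x, y] positions (three levels of brackets)
polygon_coords = [[[8.11, 48.17], [8.12, 48.17], [8.12, 48.18], [8.11, 48.17]]]

# MultiPolygon coordinates: polygons -> rings -> positions (four levels)
multipolygon_coords = [polygon_coords]
```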
I will look into it tomorrow.
The library parses the dict fine locally, but when sending the exact same data structure to the server, it fails.
Attaching the .gpkg that we used for testing in the above code (need to unzip). uavforsat_2017_CFB044_ortho_polygons.zip
Is there an error message?
The lines you used last create a JSON-encoded string (with `model_dump_json`); I would suggest using `model_dump`, which creates a dict. The `json` argument of the HTTP methods (requests and httpx behave the same here) will JSON-encode the payload again, so I think you end up with a GeoJSON geometry that was encoded twice, which the API can't parse anymore.
I haven't tried that, but I can try it myself later if you want. I am just not sure when I'll have time for that today...
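To illustrate the suspected double encoding (a minimal sketch, not from the thread; coordinates are made up):

```python
import json
from pydantic_geojson import PolygonModel

model = PolygonModel(
    type="Polygon",
    coordinates=[[[8.11, 48.17], [8.12, 48.17], [8.12, 48.18], [8.11, 48.17]]],
)

as_dict = model.model_dump()      # dict -> requests' json= encodes it once
as_str = model.model_dump_json()  # str  -> requests' json= encodes the string AGAIN

print(json.dumps(as_dict))  # {"type": "Polygon", ...}        <- a JSON object
print(json.dumps(as_str))   # "{\"type\": \"Polygon\", ...}"  <- a quoted string
```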
I think the error message was "Method not allowed" or something with "not allowed".
@JesJehle can you send the error? I don't have the code.
@mmaelicke @JesJehle if you send me the .env somehow, then I can reproduce (or fix) the error from my machine.
error is: api-1 | INFO: 192.168.65.1:60103 - "PUT /datasets/267/labels HTTP/1.1" 405 Method Not Allowed
Received the keys. Debugging this on Tuesday, next week.
Did not see it right away: the PUT HTTP verb is not allowed on the labels route. You need to use the POST verb. The reason is that this route is not idempotent, meaning that calling `/labels` twice will result in two label datasets. A PUT would be translated to an upsert, so calling, for example, the `/metadata` route twice would result in an update of the existing metadata.
So if you use `requests.post` for the labels, everything should be fine.
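Putting both fixes together (a sketch reusing the names from the draft above, untested):

```python
# POST, not PUT: the labels route is not idempotent
res_labels = requests.post(
    BASE_URL + f"/datasets/{dataset_id}/labels",
    json={
        "aoi": aoi_model.model_dump(),    # plain dicts, not JSON strings
        "label": label_model.model_dump(),
        "label_source": "visual_interpretation",
        "label_quality": 0,
        "label_type": "point_observation",
    },
    headers={"Authorization": f"Bearer {access_token}"},
)
```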
Thanks, all routes work now! Will do the processing tomorrow.
Fun story: I deployed the Docker container on our infrastructure for the processing this afternoon, and it locked everyone out of the server. So that was a fun walk of shame to the server room :) Turns out the default Docker IP range was identical to the IP range through which we access the server, creating conflicts...
Yeah, the ICE trains used to use the same IP range, so I was wondering for ages why my tools only worked at home and not on the way to work :)
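If it helps for next time: the default pool Docker allocates container networks from can be overridden in /etc/docker/daemon.json (a sketch; the 10.210.0.0/16 base is an arbitrary example, pick a range that does not clash with your network):

```json
{
  "default-address-pools": [
    { "base": "10.210.0.0/16", "size": 24 }
  ]
}
```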
So I will empty the database and remove the old data?
Removing the database means emptying all supabase tables with names `v1_*`? I could do that myself; it would make the upload process easier, as I need to run some more tests.
Yeah, emptying the tables and removing the associated files from the storage server.
I just did that, meaning as long as you test on a local system, the storage server stays empty. Then we can copy the processed files after you've finished. If you test the live API at data.deadtrees.earth/api/v1, I need to remove the testing files from the server again.
okay sounds good
`v1_cogs` still has data; that can also be emptied, right?
Yes. You can now empty and re-fill the tables as you wish until you start the actual processing.
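For the re-fills, a hedged sketch of emptying the tables with the supabase client (the `v1_metadata` name and the shared `id` column are assumptions; children go before `v1_datasets` so foreign keys don't block the deletes):

```python
# delete child tables before v1_datasets to avoid foreign-key errors
# (v1_metadata and the shared "id" column are assumptions)
for table in ["v1_labels", "v1_cogs", "v1_metadata", "v1_datasets"]:
    client.table(table).delete().neq("id", -1).execute()  # match-all filter
```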
I will have a look at GitHub and my mails later today to see if there are still issues left that you need me to solve.
Thanks. No rush. Tomorrow is also fine. I also have other things on my to-do list :)
For some reason I cannot delete the rows in v1_datasets because of some foreign key constraint. Is that intended? All other v1 tables are empty.
Works now. I don't know why.
For reference, the summary of changes until now:
- `project_id` datatype changed from int to string
- the `label` field in the supabase table `v1_labels` is now nullable, because there can be orthophotos that are "labeled" but contain no deadwood; the AOI remains mandatory

Thanks for that, good list! I will take care from the API side that these changes are implemented correctly on the main branch after my vacation.
After almost exactly one hour of uploading, this happened:
```
api-1 | Traceback (most recent call last):
api-1 | File "/usr/local/lib/python3.12/site-packages/uvicorn/protocols/http/httptools_impl.py", line 401, in run_asgi
api-1 | result = await app( # type: ignore[func-returns-value]
api-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
api-1 | File "/usr/local/lib/python3.12/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
api-1 | return await self.app(scope, receive, send)
api-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
api-1 | File "/usr/local/lib/python3.12/site-packages/fastapi/applications.py", line 1054, in __call__
api-1 | await super().__call__(scope, receive, send)
api-1 | File "/usr/local/lib/python3.12/site-packages/starlette/applications.py", line 123, in __call__
api-1 | await self.middleware_stack(scope, receive, send)
api-1 | File "/usr/local/lib/python3.12/site-packages/starlette/middleware/errors.py", line 186, in __call__
api-1 | raise exc
api-1 | File "/usr/local/lib/python3.12/site-packages/starlette/middleware/errors.py", line 164, in __call__
api-1 | await self.app(scope, receive, _send)
api-1 | File "/usr/local/lib/python3.12/site-packages/starlette/middleware/cors.py", line 85, in __call__
api-1 | await self.app(scope, receive, send)
api-1 | File "/usr/local/lib/python3.12/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
api-1 | await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
api-1 | File "/usr/local/lib/python3.12/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
api-1 | raise exc
api-1 | File "/usr/local/lib/python3.12/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
api-1 | await app(scope, receive, sender)
api-1 | File "/usr/local/lib/python3.12/site-packages/starlette/routing.py", line 754, in __call__
api-1 | await self.middleware_stack(scope, receive, send)
api-1 | File "/usr/local/lib/python3.12/site-packages/starlette/routing.py", line 774, in app
api-1 | await route.handle(scope, receive, send)
api-1 | File "/usr/local/lib/python3.12/site-packages/starlette/routing.py", line 295, in handle
api-1 | await self.app(scope, receive, send)
api-1 | File "/usr/local/lib/python3.12/site-packages/starlette/routing.py", line 77, in app
api-1 | await wrap_app_handling_exceptions(app, request)(scope, receive, send)
api-1 | File "/usr/local/lib/python3.12/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
api-1 | raise exc
api-1 | File "/usr/local/lib/python3.12/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
api-1 | await app(scope, receive, sender)
api-1 | File "/usr/local/lib/python3.12/site-packages/starlette/routing.py", line 74, in app
api-1 | response = await f(request)
api-1 | ^^^^^^^^^^^^^^^^
api-1 | File "/usr/local/lib/python3.12/site-packages/fastapi/routing.py", line 278, in app
api-1 | raw_response = await run_endpoint_function(
api-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
api-1 | File "/usr/local/lib/python3.12/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
api-1 | return await dependant.call(**values)
api-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
api-1 | File "/app/src/routers/upload.py", line 80, in upload_geotiff
api-1 | user = verify_token(token)
api-1 | ^^^^^^^^^^^^^^^^^^^
api-1 | File "/app/src/supabase.py", line 50, in verify_token
api-1 | response = client.auth.get_user(jwt)
api-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^
api-1 | File "/usr/local/lib/python3.12/site-packages/gotrue/_sync/gotrue_client.py", line 580, in get_user
api-1 | return self._request("GET", "user", jwt=jwt, xform=parse_user_response)
api-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
api-1 | File "/usr/local/lib/python3.12/site-packages/gotrue/_sync/gotrue_base_api.py", line 123, in _request
api-1 | raise handle_exception(e)
api-1 | gotrue.errors.AuthApiError: invalid JWT: unable to parse or verify signature, token has invalid claims: token is expired
```
Any way to extend the lifetime of the token? Could not immediately find that.
My quick fix is to get a new token before each request...
Not easily. The supabase Python lib is pretty much rubbish and crashes when you try autorefresh. Logging in fresh every time is fine. You can also set a timer to 50 minutes.
Apart from this hiccup, the processing is running fine. 125 done, 1150 to go. ETA in 30h.
gotrue.errors.AuthApiError: Request rate limit reached
.... how can I circumvent this?
Not possible. The only option is to decrease the number of logins from once per request to, e.g., once every 50 minutes.
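A minimal sketch of that 50-minute re-login, reusing `client`, `USER`, and `PASSWORD` from the draft script above (the helper name is made up):

```python
import time

TOKEN_MAX_AGE = 50 * 60  # seconds; stay below the 1h JWT expiry
_token = None
_token_time = 0.0

def get_access_token():
    """Return a cached token; log in again once it is older than 50 minutes."""
    global _token, _token_time
    if _token is None or time.time() - _token_time > TOKEN_MAX_AGE:
        auth = client.auth.sign_in_with_password({"email": USER, "password": PASSWORD})
        _token = auth.session.access_token
        _token_time = time.time()
    return _token
```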
Oh man, ok. Is there a concrete number for the rate limit?
I think it's 3 or 4 per hour. We can only customize it if we use our own SMTP server.
Thanks. SMTP? How does e-mail play a role here?
The whole supabase auth provider is one service. You can only customize any settings if you provide your own SMTP server. Many parts of the service rely on mail, e.g. for login, 2FA, OTP, password reset, etc.
The COG generation process is stuck at one specific image and has been at 100% for 1h30min already. Other TIFFs of similar size are processed just fine within minutes.
Here is the tif: https://cloud.scadsai.uni-leipzig.de/index.php/s/e2ZapJy72PoH22Q
EDIT: by 100% I meant the CPU utilization
I think it would make sense to move some of the data to the file server. If we did this in batches, we could check for possible errors. I think this is better than waiting until all the files have been processed and finding out that some of them are broken.
@cmosig @mmaelicke what do you think?
You mean the COGs, to check if the visualization works?
I'd close this issue for now and open new, more targeted issues if there are any.
@mmaelicke Do you need SSH access to our infra, or is SFTP enough? Either is fine, I just need to know.