earthcube / scheduler

Scheduling approaches related to gleaner tooling
Apache License 2.0

added summarize #48

Closed by ylyangtw 1 year ago

ylyangtw commented 1 year ago

summarize would fail if the source already exists in the summary namespace. For testing, I first clear that source from the summary namespace with this SPARQL update:

DELETE { ?s ?p ?o } WHERE { ?s ?p ?o . FILTER regex(str(?s), "iris") }
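
For repeatable testing, the same clean-up could be scripted. This is only a minimal sketch: it builds the update for one source and wraps it in a form-encoded POST, which is how SPARQL 1.1 updates are usually sent. The endpoint URL and the helper names are assumptions for illustration, not part of the scheduler or earthcube_utilities code, and the request is built but never sent.

```python
from urllib import parse, request


def build_delete_update(source: str) -> str:
    """Return a SPARQL update that deletes triples whose subject IRI matches `source`."""
    # Escape backslashes and double quotes so the value stays inside the string literal.
    # Note: `source` is still interpreted as a regular expression by the triplestore.
    escaped = source.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "DELETE { ?s ?p ?o } "
        "WHERE { ?s ?p ?o . "
        f'FILTER regex(str(?s), "{escaped}") }}'
    )


def build_update_request(endpoint: str, source: str) -> request.Request:
    """Build (but do not send) a form-encoded SPARQL update request."""
    body = parse.urlencode({"update": build_delete_update(source)}).encode("utf-8")
    return request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical endpoint: substitute the real URL of the summary namespace.
req = build_update_request(
    "http://localhost:9999/blazegraph/namespace/summary/sparql", "iris"
)
# request.urlopen(req) would send it; omitted here so the sketch has no side effects.
```

Keeping the query builder separate from the HTTP call makes it possible to check the generated update text without a running triplestore.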

valentinedwv commented 1 year ago

(Ignore this; it gets rid of the entire namespace.)

ManageBlazegraph has methods to delete and create a namespace. They were set up for integration testing, but I think they should work here.

https://earthcube.github.io/earthcube_utilities/earthcube_utilities/earthcube_utilities_code/#ec.graph.manageGraph.ManageBlazegraph

valentinedwv commented 1 year ago

If there is an issue with files loading to the graph, then we should probably save the file to S3 first and then load it to the graph.

That would let us test the loading issues better: how can we test for this, how do we catch it, and where should that be done (dagster or ec_utils)?

Then maybe the saved file becomes an actual asset at some point.
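
The save-first idea above can be sketched as follows. This is only an illustration under stated assumptions: `save` and `load` are hypothetical callables standing in for an S3 upload and a graph load, and neither is taken from the scheduler or ec_utils code.

```python
from typing import Callable, Dict


def save_then_load(
    data: bytes,
    key: str,
    save: Callable[[str, bytes], None],
    load: Callable[[str], None],
) -> str:
    """Persist `data` under `key` first, then ask the graph loader to read that key."""
    save(key, data)  # the file now exists independently of the graph
    try:
        load(key)
    except Exception as exc:
        # The saved copy survives, so the load can be inspected or retried later.
        return f"saved {key}, load failed: {exc}"
    return f"saved and loaded {key}"


# Stand-ins for testing: a dict plays the role of the bucket,
# and the loader always fails to simulate a graph outage.
store: Dict[str, bytes] = {}


def failing_load(key: str) -> None:
    raise RuntimeError("graph unavailable")


status = save_then_load(
    b"<a> <b> <c> .", "summary/iris.nt", store.__setitem__, failing_load
)
```

Passing the storage and graph operations in as callables is one way to test the failure path without a real bucket or triplestore, whether that test ends up in dagster or in ec_utils.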

valentinedwv commented 1 year ago

This issue covers how to push a file in S3 to a graph:

https://github.com/earthcube/scheduler/issues/64

ylyangtw commented 1 year ago

> If there is an issue with files loading to the graph, then we should probably save the file to S3 first and then load it to the graph.
>
> That would let us test the loading issues better: how can we test for this, how do we catch it, and where should that be done (dagster or ec_utils)?
>
> Then maybe the saved file becomes an actual asset at some point.

Agree!

ylyangtw commented 1 year ago

If loading to the graph fails, it now uploads the result to S3.

valentinedwv commented 1 year ago

Pull and check this. I reworked the code.

ylyangtw commented 1 year ago

Thanks @valentinedwv!

Do you maybe know what this can be?

valentinedwv commented 1 year ago

Might need to update a dependency to: https://github.com/earthcube/scheduler/blob/51652704c1a7022c1b61c6a09fe8240c15380b65/dagster/implnets/requirements_code.txt#L18

earthcube-utilities @ git+https://github.com/earthcube/earthcube_utilities@b671efb#subdirectory=earthcube_utilities

Should have changed it in the standard requirements.txt, too.
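
One way to catch the two requirements files drifting apart is to compare the pinned git refs. This is a rough sketch, assuming both files use the `name @ git+URL@REF` form shown above; the helper names and the `0000000` ref in the example are hypothetical.

```python
import re
from typing import Dict, Optional, Tuple

# Matches lines such as: name @ git+https://host/org/repo@REF#subdirectory=...
PIN = re.compile(
    r"^(?P<name>[A-Za-z0-9._-]+)\s*@\s*git\+(?P<url>[^@#\s]+)@(?P<ref>[^#\s]+)"
)


def pinned_ref(line: str) -> Optional[Tuple[str, str]]:
    """Return (package, ref) for a git-pinned requirement line, else None."""
    match = PIN.match(line.strip())
    return (match.group("name"), match.group("ref")) if match else None


def pins(text: str) -> Dict[str, str]:
    """Collect every git-pinned package in a requirements file's text."""
    found: Dict[str, str] = {}
    for line in text.splitlines():
        parsed = pinned_ref(line)
        if parsed:
            found[parsed[0]] = parsed[1]
    return found


def mismatched(a: str, b: str) -> Dict[str, Tuple[str, str]]:
    """Return packages pinned in both texts but at different refs."""
    pins_a, pins_b = pins(a), pins(b)
    return {
        name: (pins_a[name], pins_b[name])
        for name in pins_a.keys() & pins_b.keys()
        if pins_a[name] != pins_b[name]
    }


code_reqs = (
    "earthcube-utilities @ git+https://github.com/earthcube/"
    "earthcube_utilities@b671efb#subdirectory=earthcube_utilities"
)
# Hypothetical stale pin, used only to show a mismatch being reported.
standard_reqs = code_reqs.replace("b671efb", "0000000")
diff = mismatched(code_reqs, standard_reqs)
```

An empty result from `mismatched` would mean the two files agree on every package they both pin.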

valentinedwv commented 1 year ago

Try again. Got the triplestore swapped. Fixed an upload issue.

ylyangtw commented 1 year ago

Nice! Everything works.