AllenInstitute / segmentation-labeling-app

Data pipeline and UI for human labeling of putative ROIs from 2p cell segmentations

Feature/botocore upload #115

Closed djkapner closed 4 years ago

djkapner commented 4 years ago

includes #111, replaces #113

Improvements to uploader

example output_json:

{
  "log_level": "ERROR",
  "local_s3_manifest_copy": "logs/20200522142137_s3_manifest.jsonl",
  "failed_uploads": [],
  "succsessful_uploads": [
    {
      "response": {
        "ResponseMetadata": {
          "RequestId": "40764879149C6C5C",
          "HostId": "e7Keu9QRa25sMqLt1p6FOmfjeKzLYA0/LgDpf6idmrAtLsH99YJK0jr6VXNAzi6SBOI0ZLRMyvg=",
          "HTTPStatusCode": 200,
          "HTTPHeaders": {
            "x-amz-id-2": "e7Keu9QRa25sMqLt1p6FOmfjeKzLYA0/LgDpf6idmrAtLsH99YJK0jr6VXNAzi6SBOI0ZLRMyvg=",
            "x-amz-request-id": "40764879149C6C5C",
            "date": "Fri, 22 May 2020 21:21:40 GMT",
            "etag": "\"4be81d812a3cf8256a1e62e2cef3ca2b\"",
            "content-length": "0",
            "server": "AmazonS3"
          },
          "RetryAttempts": 1
        },
        "ETag": "\"4be81d812a3cf8256a1e62e2cef3ca2b\""
      },
      "file_name": "/allen/aibs/informatics/labeling_artifacts/seg_run_id_1042/20200519161125/full_video.webm",
      "key": "parallel-upload-test/20200522142137/865798247_full_video.webm",
      "bucket": "prod.slapp.alleninstitute.org"
    },
   ...
  ]
}
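
A minimal sketch of how the output_json above might be checked after a run. The helper name `summarize_uploads` is hypothetical and not part of slapp; the key names (including the spelling `succsessful_uploads`) are copied from the example output, and the record here is trimmed to the fields the check needs.

```python
def summarize_uploads(output):
    """Count successes and failures in an uploader output dict."""
    # Key spelled exactly as it appears in the example output_json.
    succeeded = output.get("succsessful_uploads", [])
    failed = output.get("failed_uploads", [])
    # Flag any "successful" record whose S3 response was not HTTP 200.
    non_200_keys = [
        rec["key"]
        for rec in succeeded
        if rec["response"]["ResponseMetadata"]["HTTPStatusCode"] != 200
    ]
    return {
        "n_succeeded": len(succeeded),
        "n_failed": len(failed),
        "non_200_keys": non_200_keys,
    }


# Trimmed copy of the single record shown in the example output.
example_output = {
    "log_level": "ERROR",
    "failed_uploads": [],
    "succsessful_uploads": [
        {
            "response": {
                "ResponseMetadata": {"HTTPStatusCode": 200, "RetryAttempts": 1},
                "ETag": "\"4be81d812a3cf8256a1e62e2cef3ca2b\"",
            },
            "key": "parallel-upload-test/20200522142137/865798247_full_video.webm",
            "bucket": "prod.slapp.alleninstitute.org",
        }
    ],
}

summary = summarize_uploads(example_output)
print(summary)
```

In practice the dict would come from `json.load` on the file passed as `--output_json`.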

validation

With this branch, 1000 ROIs uploaded in 40 minutes (previously 270 minutes, roughly a 6.75x speedup) with no failures (parallelization = 4).

$ python -m slapp.transfers.upload --input_json upload_input.json --output_json ./logs/output.json
WARNING:root:setting Dict fields not supported from argparse
INFO:LabelDataUploader:Requesting 1000 roi manifests from postgres
INFO:LabelDataUploader:bucket destination is s3://prod.slapp.alleninstitute.org/robustness-check/20200522152030
INFO:LabelDataUploader:126 full videos to upload
INFO:botocore.credentials:Found credentials in shared credentials file: ~/.aws/credentials
INFO:botocore.credentials:Found credentials in shared credentials file: ~/.aws/credentials
INFO:botocore.credentials:Found credentials in shared credentials file: ~/.aws/credentials
INFO:botocore.credentials:Found credentials in shared credentials file: ~/.aws/credentials
INFO:botocore.credentials:Found credentials in shared credentials file: ~/.aws/credentials
INFO:LabelDataUploader:wrote local s3 manifest copy logs/20200522152030_s3_manifest.jsonl
INFO:botocore.credentials:Found credentials in shared credentials file: ~/.aws/credentials
INFO:LabelDataUploader:uploaded s3://prod.slapp.alleninstitute.org/robustness-check/20200522152030/manifest.json
INFO:LabelDataUploader:7127 uploads succeeded
INFO:LabelDataUploader:upload job
started : 2020-05-22T15:20:30.070472
ended   : 2020-05-22T16:00:43.965029 
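
For readers unfamiliar with the pattern, here is a rough sketch of a parallel upload with 4 workers that sorts results into succeeded and failed lists, mirroring the shape of the output_json. It is not the code in this branch: `upload_all` and `fake_upload` are hypothetical names, and the stand-in upload function only simulates a response. A real `upload_fn` would presumably wrap an S3 client call such as `put_object`.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def upload_all(jobs, upload_fn, max_workers=4):
    """Run upload_fn over jobs in a thread pool; return (succeeded, failed)."""
    succeeded, failed = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the job that produced it.
        futures = {pool.submit(upload_fn, **job): job for job in jobs}
        for future in as_completed(futures):
            job = futures[future]
            try:
                response = future.result()
            except Exception as exc:
                # One failed upload is recorded; the others keep running.
                failed.append({**job, "error": str(exc)})
            else:
                succeeded.append({**job, "response": response})
    return succeeded, failed


def fake_upload(file_name, bucket, key):
    """Stand-in for a real S3 upload (assumption: no network access here)."""
    if file_name.endswith(".bad"):
        raise IOError("simulated upload failure")
    return {"ResponseMetadata": {"HTTPStatusCode": 200}}


jobs = [
    {"file_name": "a.webm", "bucket": "example-bucket", "key": "run/a.webm"},
    {"file_name": "b.webm", "bucket": "example-bucket", "key": "run/b.webm"},
    {"file_name": "c.bad", "bucket": "example-bucket", "key": "run/c.bad"},
]
succeeded, failed = upload_all(jobs, fake_upload)
print(len(succeeded), "uploads succeeded;", len(failed), "failed")
```

Threads suit this workload because uploads are I/O-bound; results arrive in completion order, so the lists are not guaranteed to match the input order.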

codecov-commenter commented 4 years ago

Codecov Report

Merging #115 into master will increase coverage by 0.68%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #115      +/-   ##
==========================================
+ Coverage   91.50%   92.19%   +0.68%     
==========================================
  Files          14       14              
  Lines         683      743      +60     
==========================================
+ Hits          625      685      +60     
  Misses         58       58              

Impacted Files               Coverage Δ
slapp/transfers/upload.py    95.68% <100.00%> (+3.38%)
slapp/transfers/utils.py     100.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 82640a8...1964318.

kschelonka commented 4 years ago

Missing some code coverage on the cleanup. Can you think of a good way to mock it out?