OpenPecha / OCR-Pipelines


fix batch generation #15

Closed eroux closed 1 year ago

eroux commented 1 year ago

My review of yesterday's pull request was a bit too quick and there's something I didn't spot:

https://github.com/OpenPecha/OCR-Pipelines/blob/main/ocr_pipelines/config.py#L21

is not right (although it should be ok for some time). A correct version should be something along the lines of:

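import uuid

def s3_key_exists(s3key: str) -> bool:
    # to implement: return True if the key already exists on S3
    raise NotImplementedError

def get_s3_path_prefix(bdrc_scan_id: str, service: str, batch_id: str) -> str:
    # this should be implemented too
    raise NotImplementedError

def get_available_batch_id(bdrc_scan_id: str, service: str) -> str:
    nb_iterations = 0
    while nb_iterations < 30:
        # 4 hex characters give 65536 possible ids, so collisions are possible
        candidate = f"batch-{uuid.uuid4().hex[:4]}"
        infos3key = get_s3_path_prefix(bdrc_scan_id, service, candidate)
        if not s3_key_exists(infos3key):
            return candidate
        nb_iterations += 1
    raise RuntimeError(f"cannot find available batch for {bdrc_scan_id}/{service} after 30 iterations!")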
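
For the existence check that is left to implement above, one possible approach is sketched below. It assumes the check is done with a boto3 S3 client and its `list_objects_v2` call; the function name `s3_prefix_exists` and the `bucket` parameter are hypothetical and do not come from the repository. The client is passed in as an argument so that the function can be tried out with a stand-in object.

```python
from typing import Any


def s3_prefix_exists(s3_client: Any, bucket: str, prefix: str) -> bool:
    """Return True if at least one object in `bucket` has a key starting with `prefix`.

    `s3_client` is assumed to behave like the client returned by
    boto3.client("s3"); the name and signature of this helper are hypothetical.
    """
    # MaxKeys=1 keeps the request cheap: a single matching key is enough
    # to know that the batch prefix is already taken.
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=1)
    return response.get("KeyCount", 0) > 0
```

If the prefix ends with a `/`, a candidate such as `batch-ab12` cannot be mistaken for a longer sibling key that merely starts with the same characters. Note that a check followed by a later write is not atomic, so two jobs started at the same moment could in principle still pick the same batch id.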