Closed ErinWeisbart closed 2 years ago
The problem is here:
# Create paths dictionary
index_directory_key = index_directory.split(f"s3://{bucket}/")[1]
paginator = s3.get_paginator("list_objects_v2")
pages = paginator.paginate(Bucket=bucket, Prefix=index_directory_key)
try:
    for page in pages:
        for x in page["Contents"]:
            fullpath = x["Key"]
            path, filename = fullpath.rsplit("/", 1)
            if filename.endswith(".tiff"):
                paths[filename] = path
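Paginating over an index directory that has all the batch folders in it doesn't play well with the dictionary, because each batch/folder has the same list of file names. The dictionary is keyed by filename alone, so each folder overwrites the entries from the one before it and only the last folder's paths survive.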
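A minimal sketch of that collision, with no S3 involved. The helper `build_paths` and the keys are made up for illustration; it only mimics the inner loop quoted above, it is not the packaged code.

```python
def build_paths(keys):
    """Mimic the quoted loop: map each .tiff filename to its folder."""
    paths = {}
    for fullpath in keys:
        path, filename = fullpath.rsplit("/", 1)
        if filename.endswith(".tiff"):
            # Keyed by filename only, so a later folder replaces an earlier one.
            paths[filename] = path
    return paths


# Hypothetical keys: two plate folders holding the same file name.
keys = [
    "projects/demo/batch1/images/plateA/Images/r01c01f01p01-ch1sk1fk1fl1.tiff",
    "projects/demo/batch1/images/plateB/Images/r01c01f01p01-ch1sk1fk1fl1.tiff",
]
paths = build_paths(keys)
print(paths)
```

Two keys go in but only one entry comes out, and it points at whichever folder was listed last.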
I believe this is fixed by changing:
index_directory_key = index_directory.split(f"s3://{bucket}/")[1]
to
index_directory_key = index_directory.split(f"s3://{bucket}/")[1] + plate_id
I don't think that fix will work, because you're just adding the plate onto whatever it is.
What are you passing in as the index_directory? I would assume it would be the plate folder, not the batch folder. Can you confirm?
Oh drat. I had set it to index_directory = f"s3://{bucket}/projects/{project_name}/{batch}/images/"
to most closely match what we usually pass, but I got muddled in mapping EFS to S3 and what we usually pass ends at the second 'images': ~/efs/${PROJECT_NAME}/workspace/images/${BATCH_ID}/${PLATE_ID}/Images.
So never mind all this...
I just have to change what I'm passing to index_directory = f"s3://{bucket}/projects/{project_name}/{batch}/images/{plate}"
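A rough sketch of what the plate-level index_directory does to the listing prefix. The bucket, project, batch, plate, and key names are placeholders, and `prefix_from_uri` is a hypothetical helper that repeats the split from the quoted code. S3 matches Prefix as a plain string prefix, which `startswith` imitates here, so a trailing slash may be worth adding if one plate name is a prefix of another.

```python
def prefix_from_uri(index_directory, bucket):
    """Strip the s3://bucket/ part, as the quoted code does."""
    return index_directory.split(f"s3://{bucket}/")[1]


# Placeholder values, for illustration only.
bucket, project_name, batch, plate = "my-bucket", "demo", "batch1", "plateA"
index_directory = f"s3://{bucket}/projects/{project_name}/{batch}/images/{plate}"
prefix = prefix_from_uri(index_directory, bucket)

keys = [
    "projects/demo/batch1/images/plateA/Images/img1.tiff",
    "projects/demo/batch1/images/plateA2/Images/img1.tiff",
    "projects/demo/batch1/images/plateB/Images/img1.tiff",
]

# Plain string-prefix matching: plateA2 also starts with ".../plateA".
matched = [k for k in keys if k.startswith(prefix)]
# With a trailing slash, only the plateA folder itself matches.
matched_strict = [k for k in keys if k.startswith(prefix + "/")]
print(prefix, len(matched), len(matched_strict))
```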
Triggering packaged pe2loaddata using search_subdirectories=True and "s3" in index_file.
The PathName_* columns are not being made correctly. They seem to use the last plate in the folder, not the plate that is being passed in.
(Metadata_Plate, FileName_Illum and PathName_Illum are created correctly with the plate that is passed in.)