Open dominikl opened 1 year ago
Note that this study will have the same caveats as https://github.com/IDR/idr-metadata/issues/640#issuecomment-1552697868 in terms of channel order. So similar decisions will need to be made in terms of the conversion we want to perform.
Since we've decided to use omero-cli-zarr for idr0036 https://github.com/IDR/idr-metadata/issues/640#issuecomment-1611108909 we should do the same here...
Going to try on a different machine since pilot-zarr1-dev and pilot-zarr2-dev are at capacity...
Update to use https://github.com/ome/omero-cli-zarr/pull/146
$ ssh -A ome-zarr-dev1.openmicroscopy.org
$ conda activate omero_zarr_export
$ pip uninstall omero-cli-zarr
$ pip install git+https://github.com/will-moore/omero-cli-zarr.git@fix_downsample_image_path
...
Successfully installed omero-cli-zarr-0.1.dev451+g983576f
Just use my home dir...
$ df -h ./
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup00-root 994G 34G 960G 4% /
Listing all 413 Plate IDs:
6208 4907 4906 6151 6210 6153 6154 6155 4908 6156 6157 6158 6159 4909 6160 6161 4911 6162 6163 6164 6165 6166 6167 4910 6168 6169 4912 6170 4913 6171 6172 4914 6173 6174 6175 6176 6177 6178 6179 4915 4917 4916 4951 4953 4952 4954 4956 4955 4958 4959 4957 6180 4962 4961 4960 4963 4964 4965 4966 4967 4968 4969 4970 4971 4973 4972 4974 4975 4976 4977 4978 6181 6182 6183 6184 6185 6186 4979 4980 6187 6188 6189 6190 6191 6192 6193 6194 4981 6195 6196 6197 6198 6199 6200 4982 6201 6202 6203 6204 6205 6206 6207 4983 4984 4986 4985 4987 4989 4988 4990 4991 4992 4993 4994 4995 4996 4997 4998 4999 5001 5000 5002 5004 5003 5005 5006 5007 5008 5010 5009 5011 5012 5014 5013 5015 5017 5016 5019 5018 5020 5021 5023 5022 5024 5025 5026 5029 5027 5028 5032 5031 5030 5033 5035 5034 5036 5037 5038 5039 5040 5041 5042 5044 5043 5047 5046 5045 5050 5048 5049 5052 5051 5053 5054 5056 5055 5059 5058 5057 5062 5060 5061 5063 5065 5064 5066 5068 5067 5069 5071 5070 5072 5074 5073 5075 5076 5077 5080 5079 5078 5081 5082 5083 5084 5085 5086 5087 5088 5089 5091 5090 5092 5094 5093 5095 5096 5097 5098 5101 5100 5099 5102 5103 5104 5105 5106 5107 5151 5152 5153 5154 5155 5156 5157 5158 5159 5160 5161 5162 5163 5164 5165 5166 5167 5168 5169 5170 5171 5172 5173 5174 5175 5176 5177 5178 5179 5180 5181 5182 5183 5184 5185 5186 5187 5188 5189 5190 5191 5192 5193 5194 5195 5196 5197 5198 5199 5200 5201 5202 5203 5204 5205 5206 5207 5208 5209 5210 5211 5212 5213 5214 5215 5216 5217 5218 5219 5220 5221 5222 5223 5224 5225 5226 5227 5228 5229 5230 5231 5232 5233 5234 5235 5236 5237 5238 5239 5240 5241 5242 5243 5244 5245 5246 5247 5248 5249 5251 5250 5252 5253 5254 5255 5256 5257 5259 5258 5260 5261 5262 5263 5264 5265 5266 5267 5268 5269 5270 5271 5272 5273 5274 5275 5276 5277 5278 5279 5280 5281 5282 5283 5284 5285 5286 5287 5288 5289 5290 5291 5292 5293 5294 5295 5296 5297 5298 5299 5300 5301 5302 5351 5303 5304 5305 5306 5307 5308 5380 5353 5354 5355 5356 5357 5358 5359 5360 5361 5362 5363 5364 5365 
5366 5367 5368 5369 5370 5371 5372 5373 5374 5375 5376 5377 5378
Start with an export of 10 Plates.
screen -S idr0016_ngff
mkdir idr0016 && cd idr0016
omero login  # idr-testing public/public
for id in 6208 4907 4906 6151 6210 6153 6154 6155 4908 6156; do
echo $id;
omero zarr export Plate:$id;
done
3 plates completed so far... Zipping...
for i in */; do zip -mr "${i%/}.zip" "$i"; done
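For reference, the loop above depends on two details: `${i%/}` strips the trailing slash from each directory name, and `zip -m` moves files into the archive (deleting the originals) while `-r` recurses. A minimal sketch of the naming step, with a dry-run preview; both helper names are hypothetical and not part of any tool used here:

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the archive name the zip loop would use.
# "${1%/}" removes one trailing slash, so "24277.ome.zarr/" becomes
# "24277.ome.zarr", and ".zip" is appended.
zip_name_for() {
    printf '%s.zip\n' "${1%/}"
}

# Dry run: print the zip command for each directory without executing it.
preview_zip_commands() {
    local dir
    for dir in "$@"; do
        printf 'zip -mr %s %s\n' "$(zip_name_for "$dir")" "$dir"
    done
}
```

Running `preview_zip_commands */` before the real loop is a cheap way to check the names, since `-m` removes the source directories once they are archived.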
Oops - forgot to rename from plateID.zarr to plateName.ome.zarr as I did for idr0036...
Unzipped each of 3 zips and renamed...
$ ls -lh
total 0
drwxr-xr-x 18 wmoore lsd 180 Jul 10 17:20 24277.ome.zarr
drwxr-xr-x 18 wmoore lsd 180 Jul 10 19:51 24278.ome.zarr
drwxr-xr-x 18 wmoore lsd 180 Jul 10 22:14 24279.ome.zarr
Then zipped again...
Upload failed...
(base) [wmoore@ome-zarr-dev1 bin]$ ./ascp -P33001 -i ../etc/asperaweb_id_dsa.openssh -d ~/idr0016 bsaspera_w@hx-fasp-1.ebi.ac.uk:5f/136e8d-xxxxxx
24277.ome.zarr.zip 95% 5277MB 97.8Mb/s 00:25 ETA
Partial Completion: 5414415K bytes transferred in 517 seconds
(85719K bits/sec), in 3 files, 1 directory; 3 files failed.
Session Stop (Error: Session data transfer timeout (server), Session data transfer timeout)
Deleted zips on BioStudies and ran again...
Uploaded 1 zip, then timed out on the next one. Repeated for each zip (only 1 uploaded at a time before the time-out). The last one uploaded with:
(base) [wmoore@ome-zarr-dev1 bin]$ ./ascp -P33001 -i ../etc/asperaweb_id_dsa.openssh -d ~/idr0016 bsaspera_w@hx-fasp-1.ebi.ac.uk:5f/136e8d-xxxxxxxxxx
24279.ome.zarr.zip 100% 6188MB 83.4Mb/s 08:51
Completed: 6336955K bytes transferred in 532 seconds
(97506K bits/sec), in 1 file, 1 directory.
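Since only one zip got through per session, the manual "run again" cycle could be scripted. A rough sketch, assuming an `UPLOAD_CMD` that takes a single file path and returns non-zero on failure (e.g. a small wrapper script around ascp; that wrapper is hypothetical and not shown). `upload_each_with_retry` and `RETRY_DELAY` are also made-up names:

```shell
#!/usr/bin/env bash
# Sketch: run an upload command once per file, retrying each file a few times.
# UPLOAD_CMD is an assumption: any single command name that takes one file
# path and exits non-zero on failure.
upload_each_with_retry() {
    local max_tries=$1; shift
    local file try
    for file in "$@"; do
        try=1
        until "$UPLOAD_CMD" "$file"; do
            if [ "$try" -ge "$max_tries" ]; then
                echo "giving up on $file after $try attempts" >&2
                return 1
            fi
            try=$((try + 1))
            sleep "${RETRY_DELAY:-30}"   # pause before the next attempt
        done
        echo "uploaded $file"
    done
}
```

Usage would be something like `UPLOAD_CMD=./upload_one.sh upload_each_with_retry 3 *.zip`. This does not address why the server times out; it only automates the retries.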
Testing on s3...
$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3 mb s3://idr0016
make_bucket: idr0016
$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3api put-bucket-policy --bucket idr0016 --policy file://policy.json
$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3api put-bucket-cors --bucket idr0016 --cors-configuration file://cors.json
$ cd /idr0016
$ unzip 24279.ome.zarr.zip && rm 24279.ome.zarr.zip
$ cd
$ ./mc cp -r idr0016/ uk1s3/idr0016/zarr
...79.ome.zarr/P/9/5/3/4/0/0: 6.61 GiB / 6.61 GiB ━━━━━━━━━━━━━━━━━ 24.22 MiB/s 4m39s
Looks good and valid... https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/idr0016/zarr/24279.ome.zarr
Above export still running... Start another, also using idr-testing (same session). Use https://github.com/ome/omero-cli-zarr/pull/147 so we don't have to manually rename every plate after export...
batch2 of 100 plates...
screen -S idr0016_export2
conda activate omero_zarr_export
pip install git+https://github.com/will-moore/omero-cli-zarr.git@name_option
cd idr0016
mkdir batch2
cd batch2
for id in 6157 6158 6159 4909 6160 6161 4911 6162 6163 6164 6165 6166 6167 4910 6168 6169 4912 6170 4913 6171 6172 4914 6173 6174 6175 6176 6177 6178 6179 4915 4917 4916 4951 4953 4952 4954 4956 4955 4958 4959 4957 6180 4962 4961 4960 4963 4964 4965 4966 4967 4968 4969 4970 4971 4973 4972 4974 4975 4976 4977 4978 6181 6182 6183 6184 6185 6186 4979 4980 6187 6188 6189 6190 6191 6192 6193 6194 4981 6195 6196 6197 6198 6199 6200 4982 6201 6202 6203 6204 6205 6206 6207 4983 4984 4986 4985 4987 4989 4988 4990; do
echo $id;
omero zarr export Plate:$id --name_by name;
done
Lost connection with IDR part-way through initial 10 plates (and batch2 of 100 plates)... Restarted, repeating the part-exported plate and the remaining 3 of 10...
ssh -A ome-zarr-dev1.openmicroscopy.org
screen -S idr0016_export
for id in 6154 6155 4908 6156; do
echo $id;
omero zarr export Plate:$id;
done
Start from scratch for all 100 plates (only part of 1 plate done so far):
cd batch2
for id in 6157 6158 6159 4909 6160 6161 4911 6162 6163 6164 6165 6166 6167 4910 6168 6169 4912 6170 4913 6171 6172 4914 6173 6174 6175 6176 6177 6178 6179 4915 4917 4916 4951 4953 4952 4954 4956 4955 4958 4959 4957 6180 4962 4961 4960 4963 4964 4965 4966 4967 4968 4969 4970 4971 4973 4972 4974 4975 4976 4977 4978 6181 6182 6183 6184 6185 6186 4979 4980 6187 6188 6189 6190 6191 6192 6193 6194 4981 6195 6196 6197 6198 6199 6200 4982 6201 6202 6203 6204 6205 6206 6207 4983 4984 4986 4985 4987 4989 4988 4990; do
echo $id;
omero zarr export Plate:$id --name_by name;
done
The remaining plates of the first batch of 10 exported OK.
Renamed to plateName.ome.zarr using e.g. https://idr.openmicroscopy.org/webclient/?show=plate-6210 to look up the names...
(base) [wmoore@ome-zarr-dev1 idr0016]$ ls -alh
total 4.0K
drwxr-xr-x 11 wmoore lsd 161 Jul 11 21:13 .
drwx------ 30 wmoore lsd 4.0K Jul 11 13:00 ..
drwxr-xr-x 18 wmoore lsd 180 Jul 10 22:14 24279.ome.zarr
drwxr-xr-x 18 wmoore lsd 180 Jul 11 21:00 4908.zarr
drwxr-xr-x 18 wmoore lsd 180 Jul 11 00:59 6151.zarr
drwxr-xr-x 18 wmoore lsd 180 Jul 11 06:23 6153.zarr
drwxr-xr-x 18 wmoore lsd 180 Jul 11 15:28 6154.zarr
drwxr-xr-x 18 wmoore lsd 180 Jul 11 18:14 6155.zarr
drwxr-xr-x 18 wmoore lsd 180 Jul 11 23:52 6156.zarr
drwxr-xr-x 18 wmoore lsd 180 Jul 11 03:47 6210.zarr
drwxr-xr-x 7 wmoore lsd 116 Jul 12 00:47 batch2
(base) [wmoore@ome-zarr-dev1 idr0016]$ mv 4908.zarr 24297.ome.zarr
(base) [wmoore@ome-zarr-dev1 idr0016]$ mv 6151.zarr 24280.ome.zarr
(base) [wmoore@ome-zarr-dev1 idr0016]$ mv 6153.zarr 24294.ome.zarr
(base) [wmoore@ome-zarr-dev1 idr0016]$ mv 6154.zarr 24295.ome.zarr
(base) [wmoore@ome-zarr-dev1 idr0016]$ mv 6155.zarr 24296.ome.zarr
(base) [wmoore@ome-zarr-dev1 idr0016]$ mv 6156.zarr 24300.ome.zarr
(base) [wmoore@ome-zarr-dev1 idr0016]$ mv 6210.zarr 24293.ome.zarr
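The renames above could also be driven from a small mapping file rather than typed one by one. A sketch, assuming a text file with one `<plateID> <plateName>` pair per line (that file format and the `rename_from_map` helper are made up for illustration; the names still have to be looked up first):

```shell
#!/usr/bin/env bash
# Sketch: rename <plateID>.zarr to <plateName>.ome.zarr from a mapping file.
# Assumed mapping format: "<plateID> <plateName>" per line, e.g. "4908 24297".
rename_from_map() {
    local map_file=$1 id name
    while read -r id name; do
        [ -n "$id" ] || continue          # skip blank lines
        if [ -d "$id.zarr" ]; then
            mv "$id.zarr" "$name.ome.zarr"
        else
            echo "skipping $id: $id.zarr not found" >&2
        fi
    done < "$map_file"
}
```

Run from the export directory, e.g. `rename_from_map plate_names.txt`. Exporting with `--name_by name` (used for batch2) makes this step unnecessary.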
Moved into batch1 dir to zip...
$ cd batch1/
(base) [wmoore@ome-zarr-dev1 batch1]$ ls -lh
total 0
drwxr-xr-x 18 wmoore lsd 180 Jul 10 22:14 24279.ome.zarr
drwxr-xr-x 18 wmoore lsd 180 Jul 11 00:59 24280.ome.zarr
drwxr-xr-x 18 wmoore lsd 180 Jul 11 03:47 24293.ome.zarr
drwxr-xr-x 18 wmoore lsd 180 Jul 11 06:23 24294.ome.zarr
drwxr-xr-x 18 wmoore lsd 180 Jul 11 15:28 24295.ome.zarr
drwxr-xr-x 18 wmoore lsd 180 Jul 11 18:14 24296.ome.zarr
drwxr-xr-x 18 wmoore lsd 180 Jul 11 21:00 24297.ome.zarr
drwxr-xr-x 18 wmoore lsd 180 Jul 11 23:52 24300.ome.zarr
(base) [wmoore@ome-zarr-dev1 batch1]$ for i in */; do zip -mr "${i%/}.zip" "$i"; done
Current status of batch2 export of 100 plates... Just under 3 hours per plate...
(base) [wmoore@ome-zarr-dev1 ~]$ ls -lh ~/idr0016/batch2
total 0
drwxr-xr-x 18 wmoore lsd 180 Jul 11 16:21 24301.ome.zarr
drwxr-xr-x 18 wmoore lsd 180 Jul 11 19:01 24302.ome.zarr
drwxr-xr-x 18 wmoore lsd 180 Jul 11 21:49 24303.ome.zarr
drwxr-xr-x 18 wmoore lsd 180 Jul 12 00:37 24304.ome.zarr
drwxr-xr-x 18 wmoore lsd 180 Jul 12 03:13 24305.ome.zarr
drwxr-xr-x 12 wmoore lsd 126 Jul 12 04:49 24306.ome.zarr
Space is enough for 100 plates (approx 6.6 GB per plate)...
$ df -h ./
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup00-root 994G 139G 855G 14% /
Zipping progress of batch1 - about 25 minutes per plate...
(base) [wmoore@ome-zarr-dev1 ~]$ ls -lh ~/idr0016/batch1
total 33G
-rw-r--r-- 1 wmoore lsd 6.1G Jul 12 03:28 24280.ome.zarr.zip
-rw-r--r-- 1 wmoore lsd 6.1G Jul 12 03:53 24293.ome.zarr.zip
-rw-r--r-- 1 wmoore lsd 6.1G Jul 12 04:17 24294.ome.zarr.zip
-rw-r--r-- 1 wmoore lsd 6.0G Jul 12 04:42 24295.ome.zarr.zip
drwxr-xr-x 18 wmoore lsd 180 Jul 11 18:14 24296.ome.zarr
drwxr-xr-x 18 wmoore lsd 180 Jul 11 21:00 24297.ome.zarr
drwxr-xr-x 18 wmoore lsd 180 Jul 11 23:52 24300.ome.zarr
-rw------- 1 wmoore lsd 4.2G Jul 12 05:03 ziLKSu9D
Upload remaining 7 zips of batch1. Timeout failure again...
(base) [wmoore@ome-zarr-dev1 bin]$ ./ascp -P33001 -i ../etc/asperaweb_id_dsa.openssh -d ~/idr0016/batch1/idr0016 bsaspera_w@hx-fasp-1.ebi.ac.uk:5f/136e8d-xxxxxx
24280.ome.zarr.zip 0% 562MB 95.1Mb/s - error -
Error 35 [Data transfer timeout]
Partial Completion: 588018K bytes transferred in 120 seconds
(40105K bits/sec), in 7 files, 1 directory; 7 files failed.
Session Stop (Error: Session data transfer timeout)
Move 7 zips to minio objectstore...
(base) [wmoore@ome-zarr-dev1 idr]$ mv ~/idr0016/batch1/idr0016/* /uod/idr/objectstore/minio/idr/idr0016/
These are then available to download from e.g. https://minio-dev.openmicroscopy.org/idr/idr0016/24280.ome.zarr.zip
Want to use the idr-ftp machine to aspera the data to BioStudies (as we did for idr0012)...
Tried to rsync from ome-zarr-dev1.openmicroscopy.org from there, but can't ssh...
(base) [wmoore@idrftp-ftp ~]$ ssh ome-zarr-dev1.openmicroscopy.org
ssh: Could not resolve hostname ome-zarr-dev1.openmicroscopy.org: Name or service not known
Try to use the minio data available above...
Install goofys on idr-ftp to copy data there.
$ cd
$ sudo wget https://github.com/kahing/goofys/releases/latest/download/goofys
$ sudo chmod +x ./goofys
$ sudo mkdir ./minio
$ sudo ~/goofys --endpoint https://minio-dev.openmicroscopy.org/ -o allow_other idr0012 ./minio
2023/07/12 12:15:51.233904 main.FATAL Unable to mount file system, see syslog for details
Downloaded 7 zips to idr-ftp machine with...
$ wget https://minio-dev.openmicroscopy.org/idr/idr0016/24294.ome.zarr.zip
etc...
...
$ ls -lh
total 42G
-rw-rw-r--. 1 wmoore wmoore 6.1G Jul 12 02:28 24280.ome.zarr.zip
-rw-rw-r--. 1 wmoore wmoore 6.1G Jul 12 02:53 24293.ome.zarr.zip
-rw-rw-r--. 1 wmoore wmoore 6.1G Jul 12 03:17 24294.ome.zarr.zip
-rw-rw-r--. 1 wmoore wmoore 6.0G Jul 12 03:42 24295.ome.zarr.zip
-rw-rw-r--. 1 wmoore wmoore 6.0G Jul 12 04:10 24296.ome.zarr.zip
-rw-rw-r--. 1 wmoore wmoore 6.1G Jul 12 04:40 24297.ome.zarr.zip
-rw-rw-r--. 1 wmoore wmoore 6.0G Jul 12 05:29 24300.ome.zarr.zip
Upload to BioStudies...
sudo /root/.aspera/cli/bin/ascp -P33001 -i /root/.aspera/cli/etc/asperaweb_id_dsa.openssh -d /data/ngff/idr0016/idr0016/ bsaspera_w@hx-fasp-1.ebi.ac.uk:5f/xxxxxx
Tried to install p7zip on ome-zarr-dev1 without success...
$ sudo yum install p7zip
Loaded plugins: langpacks, product-id, rhnplugin, search-disabled-repos, subscription-manager
The SSL certificate failed verification.
@dominikl Plates are taking about 3 hours to export. We have exported about 27 / 413 Plates (from idr-testing.openmicroscopy.org).
(386 * 3) / 24 = 48 days. This is too long, so we need to speed this up and run on multiple machines, exporting from multiple servers, e.g. idr-testing and idr.openmicroscopy.org.
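To see how the estimate changes with more exporters, here is a small sketch of the same arithmetic. `estimate_days` is a hypothetical helper, and it assumes the plates split evenly and that each exporter keeps the ~3 hours per plate rate:

```shell
#!/usr/bin/env bash
# Sketch: whole days (rounded down) to export N plates at H hours per plate,
# spread evenly over W exporters. Integer shell arithmetic only.
estimate_days() {
    local plates=$1 hours_per_plate=$2 workers=${3:-1}
    echo $(( plates * hours_per_plate / (24 * workers) ))
}
```

`estimate_days 386 3` gives 48 as above, and `estimate_days 386 3 3` gives 16, so three exporters running in parallel would bring it down to a bit over two weeks.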
The first 10 Plates (done) and 100 (now running) leave 303 Plates to follow (or export at the same time elsewhere).
These 303 IDs are:
4991 4992 4993 4994 4995 4996 4997 4998 4999 5001 5000 5002 5004 5003 5005 5006 5007 5008 5010 5009 5011 5012 5014 5013 5015 5017 5016 5019 5018 5020 5021 5023 5022 5024 5025 5026 5029 5027 5028 5032 5031 5030 5033 5035 5034 5036 5037 5038 5039 5040 5041 5042 5044 5043 5047 5046 5045 5050 5048 5049 5052 5051 5053 5054 5056 5055 5059 5058 5057 5062 5060 5061 5063 5065 5064 5066 5068 5067 5069 5071 5070 5072 5074 5073 5075 5076 5077 5080 5079 5078 5081 5082 5083 5084 5085 5086 5087 5088 5089 5091 5090 5092 5094 5093 5095 5096 5097 5098 5101 5100 5099 5102 5103 5104 5105 5106 5107 5151 5152 5153 5154 5155 5156 5157 5158 5159 5160 5161 5162 5163 5164 5165 5166 5167 5168 5169 5170 5171 5172 5173 5174 5175 5176 5177 5178 5179 5180 5181 5182 5183 5184 5185 5186 5187 5188 5189 5190 5191 5192 5193 5194 5195 5196 5197 5198 5199 5200 5201 5202 5203 5204 5205 5206 5207 5208 5209 5210 5211 5212 5213 5214 5215 5216 5217 5218 5219 5220 5221 5222 5223 5224 5225 5226 5227 5228 5229 5230 5231 5232 5233 5234 5235 5236 5237 5238 5239 5240 5241 5242 5243 5244 5245 5246 5247 5248 5249 5251 5250 5252 5253 5254 5255 5256 5257 5259 5258 5260 5261 5262 5263 5264 5265 5266 5267 5268 5269 5270 5271 5272 5273 5274 5275 5276 5277 5278 5279 5280 5281 5282 5283 5284 5285 5286 5287 5288 5289 5290 5291 5292 5293 5294 5295 5296 5297 5298 5299 5300 5301 5302 5351 5303 5304 5305 5306 5307 5308 5380 5353 5354 5355 5356 5357 5358 5359 5360 5361 5362 5363 5364 5365 5366 5367 5368 5369 5370 5371 5372 5373 5374 5375 5376 5377 5378
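A list like this can be derived rather than edited by hand. A sketch, assuming the full ID list and the already-handled IDs are saved in two whitespace-separated text files (the file names and the `remaining_ids` helper are hypothetical):

```shell
#!/usr/bin/env bash
# Sketch: print every ID from the "all" file that is not in the "done" file,
# one per line, keeping the original order.
# $1: file with all IDs, $2: file with IDs already exported or in progress.
remaining_ids() {
    awk 'FILENAME == ARGV[1] { for (i = 1; i <= NF; i++) seen[$i] = 1; next }
         { for (i = 1; i <= NF; i++) if (!($i in seen)) print $i }' "$2" "$1"
}
```

For example `remaining_ids all_plates.txt done_plates.txt | tr '\n' ' '` prints the leftover IDs on one line, ready to paste into a `for id in ...` loop.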
NB: when creating conda env for exporting, use pip install git+https://github.com/will-moore/omero-cli-zarr.git@name_option and run with omero zarr export Plate:$id --name_by name; as described above.
Current export is running at
$ ssh -A ome-zarr-dev1.openmicroscopy.org
(base) [wmoore@ome-zarr-dev1 ~]$ ls -lh /lifesci/groups/jrs/wmoore/idr0016/batch2
total 4.0K
drwxr-xr-x 18 wmoore lsd 180 Jul 13 04:47 24320.ome.zarr
drwxr-xr-x 18 wmoore lsd 180 Jul 13 07:21 24321.ome.zarr
drwxr-xr-x 18 wmoore lsd 180 Jul 13 09:50 24352.ome.zarr
drwxr-xr-x 3 wmoore lsd 45 Jul 13 10:00 24357.ome.zarr
drwxr-xr-x 10 wmoore lsd 4.0K Jul 13 09:43 idr0016
Getting data off that machine is hard as aspera times out badly and I can't install p7zip (more reasons to run other batches elsewhere). But the exported data can sit there till I'm back. Should be enough space for over 100 Plates (6.6 GB each).
(base) [wmoore@ome-zarr-dev1 ~]$ df -h ./
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup00-root 994G 170G 824G 18% /
Moving zips off ome-zarr-dev1...
(base) [wmoore@ome-zarr-dev1 ~]$ cd idr0016/batch2/idr0016/
(base) [wmoore@ome-zarr-dev1 idr0016]$ ls *.zip
24301.ome.zarr.zip 24302.ome.zarr.zip 24303.ome.zarr.zip 24304.ome.zarr.zip 24305.ome.zarr.zip 24306.ome.zarr.zip 24307.ome.zarr.zip
(base) [wmoore@ome-zarr-dev1 idr0016]$ mv *.zip /uod/idr/objectstore/minio/idr/idr0016/
Then on idr-ftp...
$ cd /data/ngff/idr0016/
$ for i in 24301.ome.zarr.zip 24302.ome.zarr.zip 24303.ome.zarr.zip 24304.ome.zarr.zip 24305.ome.zarr.zip 24306.ome.zarr.zip 24307.ome.zarr.zip; do wget "https://minio-dev.openmicroscopy.org/idr/idr0016/${i%}"; done;
From there, upload to BioStudies...
sudo /root/.aspera/cli/bin/ascp -P33001 -i /root/.aspera/cli/etc/asperaweb_id_dsa.openssh -d /data/ngff/idr0016/idr0016/ bsaspera_w@hx-fasp-1.ebi.ac.uk:5f/xxxxxxxx
Repeated above steps with 7 more zips... 24308.ome.zarr.zip 24309.ome.zarr.zip 24310.ome.zarr.zip 24311.ome.zarr.zip 24312.ome.zarr.zip 24313.ome.zarr.zip 24319.ome.zarr.zip
Export, zip and upload now running on pilot-idr0136 and pilot-idr0142. Ids which have been exported or are still in progress:
pilot-idr0136:
5303 5304 5305 5306 5307 5308 5380 5353 5354 5355 5356 5357 5358 5359 5360 5361 5362 5363 5364 5365 5366 5367 5368 5369 5370 5371 5372 5373 5374 5375 5376 5377 5378
5214 5215 5216 5217 5218 5219 5220 5221 5222 5223 5224 5225 5226 5227 5228 5229 5230 5231 5232 5233 5234 5235 5236 5237 5238 5239 5240 5241 5242 5243
5184 5185 5186 5187 5188 5189 5190 5191 5192 5193 5194 5195 5196 5197 5198 5199 5200 5201 5202 5203 5204 5205 5206 5207 5208 5209 5210 5211 5212 5213
5081 5082 5083 5084 5085 5086 5087 5088 5089 5091 5090 5092 5094 5093 5095 5096 5097 5098 5101 5100 5099 5102 5103 5104 5105 5106 5107 5151 5152 5153
4991 4992 4993 4994 4995 4996 4997 4998 4999 5001 5000 5002 5004 5003 5005 5006 5007 5008 5010 5009 5011 5012 5014 5013 5015 5017 5016 5019 5018 5020 5021
pilot-idr0142:
5274 5275 5276 5277 5278 5279 5280 5281 5282 5283 5284 5285 5286 5287 5288 5289 5290 5291 5292 5293 5294 5295 5296 5297 5298 5299 5300 5301 5302 5351
5244 5245 5246 5247 5248 5249 5251 5250 5252 5253 5254 5255 5256 5257 5259 5258 5260 5261 5262 5263 5264 5265 5266 5267 5268 5269 5270 5271 5272 5273
5154 5155 5156 5157 5158 5159 5160 5161 5162 5163 5164 5165 5166 5167 5168 5169 5170 5171 5172 5173 5174 5175 5176 5177 5178 5179 5180 5181 5182 5183
5052 5051 5053 5054 5056 5055 5059 5058 5057 5062 5060 5061 5063 5065 5064 5066 5068 5067 5069 5071 5070 5072 5074 5073 5075 5076 5077 5080 5079 5078
5023 5022 5024 5025 5026 5029 5027 5028 5032 5031 5030 5033 5035 5034 5036 5037 5038 5039 5040 5041 5042 5044 5043 5047 5046 5045 5050 5048 5049
Ids left to do:
---
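The per-machine lists above are batches of roughly 30 IDs each. One way to cut a long ID list into batches like that is plain `xargs -n`; a minimal sketch (`batch_ids` is just a hypothetical wrapper name):

```shell
#!/usr/bin/env bash
# Sketch: read whitespace-separated IDs on stdin and print them in batches,
# one batch per line. $1 is the batch size.
# xargs with no command runs echo, so each output line holds up to $1 IDs.
batch_ids() {
    xargs -n "$1"
}
```

E.g. `batch_ids 30 < remaining.txt` (file name assumed) gives one line per batch, which can then be handed out to the different machines.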
Note: Running in conda env:
conda create -n "myenv" python=3.9.12 ipython
conda activate myenv
conda install -c ome omero-py
pip install git+https://github.com/will-moore/omero-cli-zarr.git@name_option
I converted all the remaining Ids from https://github.com/IDR/idr-metadata/issues/638#issuecomment-1633863222 and uploaded to BioStudies. But I'm not sure if there are still other zips somewhere which have not been uploaded yet. Need @will-moore to check again. I also forgot to check if there was already an idr0016_files.tsv, so I might have overwritten it with my idr0016_files.tsv (which only contains the zips for the IDs mentioned above).
Move zips...
ssh -A ome-zarr-dev1.openmicroscopy.org
cd idr0016/batch2/idr0016/
$ ls *.zip
24320.ome.zarr.zip 24321.ome.zarr.zip 24352.ome.zarr.zip 24357.ome.zarr.zip 24507.ome.zarr.zip 24508.ome.zarr.zip 24509.ome.zarr.zip 24512.ome.zarr.zip
mv *.zip /uod/idr/objectstore/minio/idr/idr0016/
ssh idr-ftp.openmicroscopy.org
cd /data/ngff/idr0016/idr0016
screen -S idr0016_wget
for i in 24320.ome.zarr.zip 24321.ome.zarr.zip 24352.ome.zarr.zip 24357.ome.zarr.zip 24507.ome.zarr.zip 24508.ome.zarr.zip 24509.ome.zarr.zip 24512.ome.zarr.zip; do wget "https://minio-dev.openmicroscopy.org/idr/idr0016/${i%}"; done;
sudo /root/.aspera/cli/bin/ascp -P33001 -i /root/.aspera/cli/etc/asperaweb_id_dsa.openssh -d /data/ngff/idr0016/idr0016/ bsaspera_w@hx-fasp-1.ebi.ac.uk:5f/xxxxxx
Zipping up 79 zarrs exported above on ome-zarr-dev1...
$ screen -r idr0016_zip
$ cd idr0016/batch2/
$ ls
24514.ome.zarr 24562.ome.zarr 24586.ome.zarr 24596.ome.zarr 24619.ome.zarr 24636.ome.zarr 24644.ome.zarr 24654.ome.zarr 24667.ome.zarr 24732.ome.zarr
24515.ome.zarr 24563.ome.zarr 24588.ome.zarr 24602.ome.zarr 24623.ome.zarr 24637.ome.zarr 24645.ome.zarr 24655.ome.zarr 24683.ome.zarr 24733.ome.zarr
24516.ome.zarr 24564.ome.zarr 24590.ome.zarr 24604.ome.zarr 24624.ome.zarr 24638.ome.zarr 24646.ome.zarr 24656.ome.zarr 24684.ome.zarr 24734.ome.zarr
24517.ome.zarr 24565.ome.zarr 24591.ome.zarr 24605.ome.zarr 24625.ome.zarr 24639.ome.zarr 24647.ome.zarr 24657.ome.zarr 24685.ome.zarr 24735.ome.zarr
24518.ome.zarr 24566.ome.zarr 24592.ome.zarr 24609.ome.zarr 24631.ome.zarr 24640.ome.zarr 24648.ome.zarr 24661.ome.zarr 24687.ome.zarr 24736.ome.zarr
24523.ome.zarr 24583.ome.zarr 24593.ome.zarr 24611.ome.zarr 24633.ome.zarr 24641.ome.zarr 24651.ome.zarr 24663.ome.zarr 24688.ome.zarr 24739.ome.zarr
24525.ome.zarr 24584.ome.zarr 24594.ome.zarr 24617.ome.zarr 24634.ome.zarr 24642.ome.zarr 24652.ome.zarr 24664.ome.zarr 24726.ome.zarr
24560.ome.zarr 24585.ome.zarr 24595.ome.zarr 24618.ome.zarr 24635.ome.zarr 24643.ome.zarr 24653.ome.zarr 24666.ome.zarr 24731.ome.zarr
$ for i in */; do zip -mr "${i%/}.zip" "$i"; done
$ screen -r idr0016_export
$ cd idr0016/batch2/
$ ls *.zip | wc
78 78 1482
$ mv *.zip /uod/idr/objectstore/minio/idr/idr0016/
on idr-ftp
for i in 24514.ome.zarr.zip 24564.ome.zarr.zip 24592.ome.zarr.zip 24617.ome.zarr.zip 24636.ome.zarr.zip 24646.ome.zarr.zip 24661.ome.zarr.zip 24726.ome.zarr.zip 24515.ome.zarr.zip 24565.ome.zarr.zip 24593.ome.zarr.zip 24618.ome.zarr.zip 24637.ome.zarr.zip 24647.ome.zarr.zip 24663.ome.zarr.zip 24731.ome.zarr.zip 24516.ome.zarr.zip 24566.ome.zarr.zip 24594.ome.zarr.zip 24619.ome.zarr.zip 24638.ome.zarr.zip 24648.ome.zarr.zip 24664.ome.zarr.zip 24732.ome.zarr.zip 24517.ome.zarr.zip 24583.ome.zarr.zip 24595.ome.zarr.zip 24623.ome.zarr.zip 24639.ome.zarr.zip 24651.ome.zarr.zip 24666.ome.zarr.zip 24733.ome.zarr.zip 24518.ome.zarr.zip 24584.ome.zarr.zip 24596.ome.zarr.zip 24624.ome.zarr.zip 24640.ome.zarr.zip 24652.ome.zarr.zip 24667.ome.zarr.zip 24734.ome.zarr.zip 24523.ome.zarr.zip 24585.ome.zarr.zip 24602.ome.zarr.zip 24625.ome.zarr.zip 24641.ome.zarr.zip 24653.ome.zarr.zip 24683.ome.zarr.zip 24735.ome.zarr.zip 24525.ome.zarr.zip 24586.ome.zarr.zip 24604.ome.zarr.zip 24631.ome.zarr.zip 24642.ome.zarr.zip 24654.ome.zarr.zip 24684.ome.zarr.zip 24736.ome.zarr.zip 24560.ome.zarr.zip 24588.ome.zarr.zip 24605.ome.zarr.zip 24633.ome.zarr.zip 24643.ome.zarr.zip 24655.ome.zarr.zip 24685.ome.zarr.zip 24739.ome.zarr.zip 24562.ome.zarr.zip 24590.ome.zarr.zip 24609.ome.zarr.zip 24634.ome.zarr.zip 24644.ome.zarr.zip 24656.ome.zarr.zip 24687.ome.zarr.zip 24563.ome.zarr.zip 24591.ome.zarr.zip 24611.ome.zarr.zip 24635.ome.zarr.zip 24645.ome.zarr.zip 24657.ome.zarr.zip 24688.ome.zarr.zip; do wget "https://minio-dev.openmicroscopy.org/idr/idr0016/${i%}"; done;
Uploading 78 zips...
$ sudo /root/.aspera/cli/bin/ascp -P33001 -i /root/.aspera/cli/etc/asperaweb_id_dsa.openssh -d /data/ngff/idr0016/idr0016/ bsaspera_w@hx-fasp-1.ebi.ac.uk:5f/xxxxxxxx
All zips have been uploaded to BioStudies now, so we have all 413 zarr.zips there. Just sorting by size, I see that most are about 5GB but some are virtually empty.
Need to re-export all of the smallest (344 bytes) zarrs.
It turns out that these all have an empty A1 well, so the export failed with a bug that is fixed at https://github.com/ome/omero-cli-zarr/pull/147/commits/1d726264f44e2b6cb833bcc23603e2b7e56121b5
Other smaller zips are due to plates having a low number of Wells (don't need to re-export).
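If the zips are still on local disk, the near-empty ones can be picked out with `find` instead of sorting by eye. A sketch, assuming a local directory of zips; the `list_small_zips` name and the 1024-byte cut-off are arbitrary choices for illustration:

```shell
#!/usr/bin/env bash
# Sketch: list zip files in a directory that are smaller than a byte limit.
# $1: directory, $2: size limit in bytes (files strictly smaller are listed).
# With the "c" suffix, find compares exact byte counts with no rounding.
list_small_zips() {
    find "$1" -maxdepth 1 -type f -name '*.zip' -size -"$2"c | sort
}
```

`list_small_zips /data/ngff/idr0016/idr0016 1024` would then flag the 344-byte archives but not the plates that are merely small because few Wells are filled.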
Manually looking up IDs for the empty plates...
5259, 5258, 5260, 5261
Export on idr-ftp...
conda activate omero_zarr_export
pip freeze | grep zarr
ome-zarr==0.8.0
omero-cli-zarr @ git+https://github.com/will-moore/omero-cli-zarr.git@e882a620d575bffdca21144a41bb990ab2039d8e
zarr==2.15.0
pip install -U git+https://github.com/will-moore/omero-cli-zarr.git@name_option
Successfully installed omero-cli-zarr-0.1.dev456+gc73d400
omero login
for id in 5259 5258 5260 5261; do
omero zarr export Plate:$id --name_by name;
done
Zip and upload...
$ sudo /root/.aspera/cli/bin/ascp -P33001 -i /root/.aspera/cli/etc/asperaweb_id_dsa.openssh -d /data/ngff/idr0016/re_export/idr0016/ bsaspera_w@hx-fasp-1.ebi.ac.uk:5f/xxxxxx
26564.ome.zarr.zip 100% 1022MB 480Mb/s 00:20
26569.ome.zarr.zip 100% 219MB 432Mb/s 00:24
26572.ome.zarr.zip 100% 795MB 456Mb/s 00:39
26574.ome.zarr.zip 100% 1282MB 371Mb/s 01:04
These are still quite small compared with other plates, but probably just due to not many Wells being filled.
At https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/pages/S-BIAD851.html we now have 122 out of 413 filesets "viewable".
Let's just take the first 10 for testing...
idr0016/24516.ome.zarr,S-BIAD851/05334862-30d8-4a98-899f-2738a0dfc94d,23576
idr0016/25918.ome.zarr,S-BIAD851/0e4290c9-52ba-418c-ae97-86e5e7a43439,21482
idr0016/26592.ome.zarr,S-BIAD851/0e46a0b5-6257-425d-bb91-1f953ae6c913,21569
idr0016/24638.ome.zarr,S-BIAD851/0ed303e9-ecd5-4945-8e92-59b392e51554,23585
idr0016/24595.ome.zarr,S-BIAD851/0f02c2f2-2ca7-424f-8186-2cbd88903cbb,21263
idr0016/26672.ome.zarr,S-BIAD851/1001629e-4727-4e8f-b741-dd825fb1dd63,21596
idr0016/25569.ome.zarr,S-BIAD851/104f679f-a14a-42f6-97d6-bf9507de606b,21352
idr0016/24617.ome.zarr,S-BIAD851/1110cfdc-f807-4464-8342-6716cad0fd07,21270
idr0016/25707.ome.zarr,S-BIAD851/11d072c0-112c-4fb2-9170-6009ca3f7bbc,21402
idr0016/25576.ome.zarr,S-BIAD851/11f72eb1-ab8c-4765-8cf4-660556471ac5,21359
Found prefix demo_2/2017-08/16 // 02-26-49.136 for fileset 23576
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2017-08/16/02-26-49.136
Creating dir at /data/OMERO/ManagedRepository/demo_2/2017-08/16/02-26-49.136_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2017-08/16/02-26-49.136_mkngff/05334862-30d8-4a98-899f-2738a0dfc94d.zarr -> /bia-integrator-data/S-BIAD851/05334862-30d8-4a98-899f-2738a0dfc94d/05334862-30d8-4a98-899f-2738a0dfc94d.zarr
...
Was taking a long time to mkngff so stopped after the 1st 2 completed...
Taking over 2 hours per Fileset.
Ran just those 2...
$ psql -U omero -d idr -h $DBHOST -f 23576.sql
BEGIN
mkngff_fileset
----------------
5287517
(1 row)
COMMIT
(mkngff) bash-4.2$ psql -U omero -d idr -h $DBHOST -f 21482.sql
BEGIN
mkngff_fileset
----------------
5287518
(1 row)
Manual psql since we didn't have https://github.com/IDR/omero-mkngff/pull/8
idr=> UPDATE pixels SET name = '.zattrs', path = 'demo_2/2017-08/16/02-26-49.136_mkngff/05334862-30d8-4a98-899f-2738a0dfc94d.zarr' where image in (select id from Image where fileset = 5287517);
UPDATE 2304
idr=> UPDATE pixels SET name = '.zattrs', path = 'demo_2/2016-06/25/17-09-16.470_mkngff/0e4290c9-52ba-418c-ae97-86e5e7a43439.zarr' where image in (select id from Image where fileset = 5287518);
UPDATE 2304
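The two manual statements above only differ in the fileset ID and the zarr path, so they could be generated. A sketch that only prints the SQL for review and does not run it; `pixels_update_sql` is a hypothetical helper, and it assumes the path contains no single quotes:

```shell
#!/usr/bin/env bash
# Sketch: print the pixels UPDATE statement for one new fileset.
# $1: new fileset ID (as returned by mkngff_fileset)
# $2: .zarr path relative to the ManagedRepository
pixels_update_sql() {
    local fileset_id=$1 zarr_path=$2
    case $fileset_id in
        ''|*[!0-9]*) echo "fileset ID must be numeric: $fileset_id" >&2; return 1 ;;
    esac
    printf "UPDATE pixels SET name = '.zattrs', path = '%s' WHERE image IN (SELECT id FROM Image WHERE fileset = %s);\n" "$zarr_path" "$fileset_id"
}
```

The output can be pasted into psql by hand. With https://github.com/IDR/omero-mkngff/pull/8 in place this manual step should not be needed.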
Checking first of those plates "24516" at http://localhost:1080/webclient/?show=image-3333350 Memo regenerating...
Memo file generation for that Plate doesn't seem to have completed. http://localhost:1040/webclient/?show=image-3333350 still not displaying...
Checking logs for that fileset: don't see anything before just now... NB: the current build of OMEZarrReader is logging everything at ERROR, even messages that aren't errors.
(base) [wmoore@pilot-idr0125-omeroreadwrite ~]$ grep "mkngff/05334862-30d8" /opt/omero/server/OMERO.server/var/log/Blitz-0.log
2023-09-19 15:20:13,493 INFO [ ome.services.OmeroFilePathResolver] (l.Server-3) Metadata only file, resulting path: /data/OMERO/ManagedRepository/demo_2/2017-08/16/02-26-49.136_mkngff/05334862-30d8-4a98-899f-2738a0dfc94d.zarr/.zattrs
2023-09-19 15:20:15,736 INFO [ loci.formats.ImageReader] (l.Server-3) ZarrReader initializing /data/OMERO/ManagedRepository/demo_2/2017-08/16/02-26-49.136_mkngff/05334862-30d8-4a98-899f-2738a0dfc94d.zarr/.zattrs
2023-09-19 15:20:16,621 ERROR [ loci.formats.FormatHandler] (l.Server-3) ZarrReader attempting to initialize file: /data/OMERO/ManagedRepository/demo_2/2017-08/16/02-26-49.136_mkngff/05334862-30d8-4a98-899f-2738a0dfc94d.zarr/.zattrs
2023-09-19 15:22:00,667 INFO [ ome.services.OmeroFilePathResolver] (l.Server-4) Metadata only file, resulting path: /data/OMERO/ManagedRepository/demo_2/2017-08/16/02-26-49.136_mkngff/05334862-30d8-4a98-899f-2738a0dfc94d.zarr/.zattrs
2023-09-19 15:22:00,677 INFO [ loci.formats.ImageReader] (l.Server-4) ZarrReader initializing /data/OMERO/ManagedRepository/demo_2/2017-08/16/02-26-49.136_mkngff/05334862-30d8-4a98-899f-2738a0dfc94d.zarr/.zattrs
2023-09-19 15:22:01,568 ERROR [ loci.formats.FormatHandler] (l.Server-4) ZarrReader attempting to initialize file: /data/OMERO/ManagedRepository/demo_2/2017-08/16/02-26-49.136_mkngff/05334862-30d8-4a98-899f-2738a0dfc94d.zarr/.zattrs
2023-09-19 15:31:35,110 INFO [ ome.services.util.ServiceHandler] (l.Server-9) Rslt: ([demo_2/2017-08/16/02-26-49.136_mkngff/05334862-30d8-4a98-899f-2738a0dfc94d.zarr/P/9/5/3/, .zarray, unknown], [demo_2/2017-08/16/02-26-49.136_mkngff/05334862-30d8-4a98-899f-2738a0dfc94d.zarr/P/9/5/, 3, unknown], [demo_2/2017-08/16/02-26-49.136_mkngff/05334862-30d8-4a98-899f-2738a0dfc94d.zarr/P/9/5/2/, .zarray, unknown], ... 26527 more)
Tried an Image from the 2nd Plate above to trigger memo file for that Plate: http://localhost:1040/webclient/?show=image-2500103 and checking for logs on that Fileset:
(base) [wmoore@pilot-idr0125-omeroreadwrite ~]$ grep "mkngff/0e4290c9-52ba" /opt/omero/server/OMERO.server/var/log/Blitz-0.log
2023-09-19 15:24:06,586 INFO [ ome.services.OmeroFilePathResolver] (l.Server-0) Metadata only file, resulting path: /data/OMERO/ManagedRepository/demo_2/2016-06/25/17-09-16.470_mkngff/0e4290c9-52ba-418c-ae97-86e5e7a43439.zarr/.zattrs
2023-09-19 15:24:07,774 INFO [ ome.services.OmeroFilePathResolver] (l.Server-2) Metadata only file, resulting path: /data/OMERO/ManagedRepository/demo_2/2016-06/25/17-09-16.470_mkngff/0e4290c9-52ba-418c-ae97-86e5e7a43439.zarr/.zattrs
2023-09-19 15:24:07,815 INFO [ loci.formats.ImageReader] (l.Server-0) ZarrReader initializing /data/OMERO/ManagedRepository/demo_2/2016-06/25/17-09-16.470_mkngff/0e4290c9-52ba-418c-ae97-86e5e7a43439.zarr/.zattrs
2023-09-19 15:24:07,815 INFO [ loci.formats.ImageReader] (l.Server-2) ZarrReader initializing /data/OMERO/ManagedRepository/demo_2/2016-06/25/17-09-16.470_mkngff/0e4290c9-52ba-418c-ae97-86e5e7a43439.zarr/.zattrs
2023-09-19 15:24:09,140 ERROR [ loci.formats.FormatHandler] (l.Server-0) ZarrReader attempting to initialize file: /data/OMERO/ManagedRepository/demo_2/2016-06/25/17-09-16.470_mkngff/0e4290c9-52ba-418c-ae97-86e5e7a43439.zarr/.zattrs
2023-09-19 15:24:09,704 ERROR [ loci.formats.FormatHandler] (l.Server-2) ZarrReader attempting to initialize file: /data/OMERO/ManagedRepository/demo_2/2016-06/25/17-09-16.470_mkngff/0e4290c9-52ba-418c-ae97-86e5e7a43439.zarr/.zattrs
2023-09-19 15:25:15,162 INFO [ ome.services.util.ServiceHandler] (l.Server-8) Rslt: ([demo_2/2016-06/25/17-09-16.470_mkngff/0e4290c9-52ba-418c-ae97-86e5e7a43439.zarr/P/9/5/3/, .zarray, unknown], [demo_2/2016-06/25/17-09-16.470_mkngff/0e4290c9-52ba-418c-ae97-86e5e7a43439.zarr/P/9/5/, 3, unknown], [demo_2/2016-06/25/17-09-16.470_mkngff/0e4290c9-52ba-418c-ae97-86e5e7a43439.zarr/P/9/5/2/, .zarray, unknown], ... 26527 more)
Memo file generation completed for 2nd plate above, viewing http://localhost:1040/webclient/?show=image-2500103. The ZarrReader used was yesterday's manual ERROR logging build (also updated on idr0125-pilot) https://github.com/ome/ZarrReader/pull/64#issuecomment-1725456254
(base) [wmoore@pilot-idr0125-omeroreadwrite ~]$ grep -A 2 "saved memo" /opt/omero/server/OMERO.server/var/log/Blitz-0.log | grep -A 2 "mkngff/0e4290c9-52ba"
2023-09-19 20:29:14,972 DEBUG [ loci.formats.Memoizer] (l.Server-2) saved memo file: /data/OMERO/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/2016-06/25/17-09-16.470_mkngff/0e4290c9-52ba-418c-ae97-86e5e7a43439.zarr/..zattrs.bfmemo (3898753 bytes)
2023-09-19 20:29:14,972 DEBUG [ loci.formats.Memoizer] (l.Server-2) start[1695137047777] time[18307195] tag[loci.formats.Memoizer.setId]
2023-09-19 20:29:14,972 INFO [ ome.io.nio.PixelsService] (l.Server-2) Creating BfPixelBuffer: /data/OMERO/ManagedRepository/demo_2/2016-06/25/17-09-16.470_mkngff/0e4290c9-52ba-418c-ae97-86e5e7a43439.zarr/.zattrs Series: 0
--
2023-09-19 20:29:15,056 DEBUG [ loci.formats.Memoizer] (l.Server-0) saved memo file: /data/OMERO/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/2016-06/25/17-09-16.470_mkngff/0e4290c9-52ba-418c-ae97-86e5e7a43439.zarr/..zattrs.bfmemo (3898753 bytes)
2023-09-19 20:29:15,056 DEBUG [ loci.formats.Memoizer] (l.Server-0) start[1695137046589] time[18308467] tag[loci.formats.Memoizer.setId]
2023-09-19 20:29:15,056 INFO [ ome.io.nio.PixelsService] (l.Server-0) Creating BfPixelBuffer: /data/OMERO/ManagedRepository/demo_2/2016-06/25/17-09-16.470_mkngff/0e4290c9-52ba-418c-ae97-86e5e7a43439.zarr/.zattrs Series: 0
18307195 ms is just over 5 hours.
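The conversion of the Memoizer `time[...]` values can be checked with a few lines of shell. This is only a sketch: `ms_to_hours` is a made-up helper name, and it assumes the logged value is in milliseconds, as the note above does.

```shell
# Convert a duration in milliseconds (as logged in "time[...]") to hours.
# ms_to_hours is a hypothetical helper, not part of OMERO or Bio-Formats.
ms_to_hours() {
  awk -v ms="$1" 'BEGIN { printf "%.2f\n", ms / 3600000 }'
}

ms_to_hours 18307195   # value logged by l.Server-2
ms_to_hours 18308467   # value logged by l.Server-0
```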
Started mkngff sql && psql commit at 11:30 last night.
After ~10 hours, about 80 filesets are done (7.5 mins each). At that rate it will take ~51 hours to do all 413 filesets.
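The estimate can be reproduced from the figures quoted (80 filesets in about 10 hours, 413 in total). A minimal sketch, with `estimate_total_hours` as a hypothetical helper name:

```shell
# Estimate total hours for all filesets from the progress so far.
# Arguments: filesets done, hours elapsed, total filesets.
estimate_total_hours() {
  awk -v d="$1" -v h="$2" -v t="$3" 'BEGIN { printf "%.1f\n", t * h / d }'
}

estimate_total_hours 80 10 413
```

Since both inputs are rough ("~10 hours", "about 80"), the result is only a ballpark figure.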
Server restart to remount goofys...
Restart omero mkngff sql generation (now using https://github.com/IDR/omero-mkngff/pull/11/commits/a2d0aeeb5195e7374c7cb48e5d989d813a05f982 to skip sql if already done), but this time don't execute the sql (we don't want to re-run sql that's been run before).
Use same $SECRET as in existing sql, so they all have the same...
export SECRET=602d53b5-6120-4a07-8013-a81c16a5ee81
for r in $(cat $IDRID.csv); do
biapath=$(echo $r | cut -d',' -f2)
uuid=$(echo $biapath | cut -d'/' -f2)
fsid=$(echo $r | cut -d',' -f3)
omero mkngff sql --symlink_repo /data/OMERO/ManagedRepository --secret=$SECRET $fsid "/bia-integrator-data/$biapath/$uuid.zarr" >> "$IDRID/$fsid.sql"
done
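The linked commit handles the skipping inside omero-mkngff itself. As a separate, shell-level sketch (not that implementation), the rows that still need SQL could be listed before looping. `pending_rows` is a hypothetical helper, and it assumes the three-column CSV layout (name, BIA path, fileset ID) used by the loop above.

```shell
# Print the CSV rows whose SQL file is missing or empty.
# Arguments: CSV file, directory holding <fsid>.sql files.
# pending_rows is a hypothetical helper, not part of omero-mkngff.
pending_rows() {
  csv="$1"; outdir="$2"
  # "|| [ -n "$name" ]" keeps a final line that has no trailing newline
  while IFS=, read -r name biapath fsid || [ -n "$name" ]; do
    fsid=$(printf '%s' "$fsid" | tr -d '[:space:]')
    [ -n "$fsid" ] || continue
    # -s is true only when the file exists and is larger than 0 bytes
    [ -s "$outdir/$fsid.sql" ] || printf '%s,%s,%s\n' "$name" "$biapath" "$fsid"
  done < "$csv"
}

# Usage sketch: pending_rows "$IDRID.csv" "$IDRID" > pending.csv
```

A non-empty file is not necessarily a complete one, so this only filters out rows that clearly have no usable output yet.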
Fileset 21281 first to be processed in this round. 96 Filesets done so far out of 413.
Restarted server to re-mount goofys again... Starting on Fileset 23584
...
It is becoming increasingly clear that the goofys file system is struggling with the requirements of the current fileset swap operations.
Is omero mkngff the only operation that is happening on this system when goofys falls over?
goofys e.g. by enabling --debug_fuse and/or --debug_s3 to have more information.
@sbesson
omero mkngff sql in more than 1 terminal/screen at a time. So, I'm going to stick with just 1 at a time now.
zarray dirs with https://github.com/IDR/omero-mkngff/pull/11 , but yes - still lots of recursive walking on Plates
Cancelled mkngff sql on idr-testing just now as I realised there's a bug that omits .zarray files.
Stopped after Fileset 21510 (233 / 413) in the idr0016.csv
Fixed in https://github.com/IDR/omero-mkngff/pull/11/commits/cac303d3c1bdab030ee286533b94fa744461d726
Updated...
(venv3) [root@test120-omeroreadwrite wmoore]# pip install git+https://github.com/will-moore/omero-mkngff.git@dont_walk_arrays
...
Resolved https://github.com/will-moore/omero-mkngff.git to commit 08db883c54410265783d5f5a4cf5f6b31d2dd5e3
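One way to spot SQL generated before the fix is to look for files that contain no `.zarray` entry at all. The sketch below assumes `.zarray` rows are written in the same quoted form as the `.zattrs` and `.zgroup` rows in the generated SQL; `sql_without_zarray` is a hypothetical helper name.

```shell
# Print generated SQL files in a directory that contain no '.zarray' entry.
# Assumed row format: ['<dir>/', '.zarray', 'application/octet-stream', '<url>']
# sql_without_zarray is a hypothetical helper, not part of omero-mkngff.
sql_without_zarray() {
  for f in "$1"/*.sql; do
    [ -e "$f" ] || continue
    grep -q "'\.zarray'" "$f" || printf '%s\n' "$f"
  done
}

# Usage sketch: sql_without_zarray "$IDRID"
```

Zero-byte files are reported too, since they contain no entries of any kind.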
Start from scratch on idr0138-pilot as regular wmoore user...
wget https://raw.githubusercontent.com/IDR/idr-utils/ebbb0b9dc6ed548db9bbe298c062a14885411097/scripts/ngff_filesets/idr0016.csv
(venv3) (base) [wmoore@pilot-idr0138-omeroreadwrite ~]$ for r in $(cat $IDRID.csv); do
> biapath=$(echo $r | cut -d',' -f2)
> uuid=$(echo $biapath | cut -d'/' -f2)
> fsid=$(echo $r | cut -d',' -f3)
> omero mkngff sql $fsid "/bia-integrator-data/$biapath/$uuid.zarr" >> "$IDRID/$fsid.sql"
> done
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix: demo_2/2016-06/24/05-33-04.817 for fileset: 21405
...
The last successful sql generated above was number 72/413:
idr0016/26595.ome.zarr,S-BIAD851/2632c5cd-86ec-434a-9da7-5277ab002250,21570
It seems it failed at that point, and this is the current status:
$ ls /bia-integrator-data
ls: cannot access /bia-integrator-data: Transport endpoint is not connected
Re-mounted goofys /bia-integrator-data and restarted server...
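Because the loop keeps running after the mount drops, a cheap check before each fileset would make the failure point obvious. A minimal sketch, with `mount_ok` as a hypothetical helper name:

```shell
# Succeeds only if the directory can be listed. A dropped FUSE mount
# fails here, e.g. with "Transport endpoint is not connected".
# mount_ok is a hypothetical helper, not part of goofys or omero-mkngff.
mount_ok() {
  ls "$1" >/dev/null 2>&1
}

# Usage sketch inside the generation loop:
#   mount_ok /bia-integrator-data || { echo "mount lost before fileset $fsid" >&2; break; }
```

This only detects a lost mount before the next fileset starts; it does not catch a failure part-way through one.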
Edited idr0016.csv to remove all lines before idr0016/26595.ome.zarr,S-BIAD851/2632c5cd-86ec-434a-9da7-5277ab002250,21570 (maybe should have removed that line too?)...
And re-ran omero mkngff sql as above...
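The manual CSV edit can also be done without touching the original file. This sketch prints the CSV from a given row onwards, with an option to start at the row after it; `rows_from` is a hypothetical helper name.

```shell
# Print a CSV starting at the first line equal to the given row.
# Pass "skip" as the third argument to start at the line after it.
# rows_from is a hypothetical helper, not part of any OMERO tool.
rows_from() {
  awk -v row="$2" -v mode="${3:-keep}" '
    found { print; next }
    $0 == row { found = 1; if (mode != "skip") print }
  ' "$1"
}

# Usage sketch:
#   rows_from idr0016.csv "idr0016/26595.ome.zarr,S-BIAD851/2632c5cd-86ec-434a-9da7-5277ab002250,21570" > remaining.csv
```

With `>>` as in the loop above, re-running a row whose file was already written appends a second copy, so `skip` may be the safer choice when that file is known to be complete.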
Restarted again, after 116 filesets processed since the last restart...
(venv3) (base) [wmoore@pilot-idr0138-omeroreadwrite ~]$ for r in $(cat $IDRID.csv); do biapath=$(echo $r | cut -d',' -f2); uuid=$(echo $biapath | cut -d'/' -f2); fsid=$(echo $r | cut -d',' -f3); omero mkngff sql $fsid "/bia-integrator-data/$biapath/$uuid.zarr" > "$IDRID/$fsid.sql"; done
Using session for public@idr.openmicroscopy.org:4064. Idle timeout: 10 min. Current group: Public
Found prefix: demo_2/2016-06/25/06-09-19.476 for fileset: 21471
...
All done...
A couple of files are 0 bytes:
idr0016/21453.sql
idr0016/21256.sql
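Zero-byte outputs like these can be listed directly rather than spotted by eye. A minimal sketch, with `empty_sql_files` as a hypothetical helper name:

```shell
# List zero-byte .sql files under a directory, sorted for stable output.
# empty_sql_files is a hypothetical helper name.
empty_sql_files() {
  find "$1" -type f -name '*.sql' -size 0 | sort
}

# Usage sketch: empty_sql_files "$IDRID"
```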
Need to re-convert Plate named 24667 since previous NGFF conversion is missing some files from N10 field 1:
https://ome.github.io/ome-ngff-validator/?source=https%3A%2F%2Fuk1s3.embassy.ebi.ac.uk%2Fbia-integrator-data%2FS-BIAD851%2F2c49b893-ec6d-4329-9cc3-569b820075f2%2F2c49b893-ec6d-4329-9cc3-569b820075f2.zarr&well=all and https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/S-BIAD851/2c49b893-ec6d-4329-9cc3-569b820075f2/2c49b893-ec6d-4329-9cc3-569b820075f2.zarr/N/10/
On zarr1-dev-pilot...
conda activate bioformats2raw2
~/bioformats2raw-0.6.0-24/bin/bioformats2raw --memo-directory ../memo /uod/idr/metadata/idr0016-wawer-bioactivecompoundprofiling/screens/24667.screen 24667.ome.zarr
EDIT - this failed! Forgot that we're using omero-cli-zarr for idr0016 exports...
conda activate omero_zarr_export
pip install -U git+https://github.com/will-moore/omero-cli-zarr.git@name_option
$ omero zarr export Plate:6202 --name_by name
Error loading: /home/wmoore/miniconda3/envs/omero_zarr_export/lib/python3.9/site-packages/omero/plugins/zarr.py
Traceback (most recent call last):
File "/home/wmoore/miniconda3/envs/omero_zarr_export/lib/python3.9/site-packages/omero/cli.py", line 1690, in loadpath
execfile(str(pathobj), loc)
File "/home/wmoore/miniconda3/envs/omero_zarr_export/lib/python3.9/site-packages/past/builtins/misc.py", line 87, in execfile
exec_(code, myglobals, mylocals)
File "/home/wmoore/miniconda3/envs/omero_zarr_export/lib/python3.9/site-packages/omero/plugins/zarr.py", line 1, in <module>
from omero_zarr.cli import HELP, ZarrControl
File "/home/wmoore/miniconda3/envs/omero_zarr_export/lib/python3.9/site-packages/omero_zarr/__init__.py", line 21, in <module>
from ._version import version as __version__
ModuleNotFoundError: No module named 'omero_zarr._version'
usage: /home/wmoore/miniconda3/envs/omero_zarr_export/bin/omero
[-h] [-v] [-d DEBUG] [--path PATH] [-C] [-s SERVER] [-p PORT]
[-g GROUP] [-u USER] [-w PASSWORD] [-k KEY] [--sudo ADMINUSER] [-q]
<subcommand> ...
/home/wmoore/miniconda3/envs/omero_zarr_export/bin/omero: error: argument <subcommand>: invalid choice: 'zarr'
Use idr-ftp as above:
$ conda activate omero_zarr_export
(omero_zarr_export) [wmoore@idrftp-ftp idr0016]$ pip freeze | grep zarr
ome-zarr==0.8.0
omero-cli-zarr @ git+https://github.com/will-moore/omero-cli-zarr.git@c73d40046536f8b5cc62908ebdaa86d097a30d0b
zarr==2.16.1
omero zarr export Plate:6202 --name_by name
Check that plate isn't missing Well N/10/1 as above...
(omero_zarr_export) [wmoore@idrftp-ftp idr0016]$ ls -alh 24667.ome.zarr/N/10/
total 12K
drwxrwxr-x. 8 wmoore wmoore 126 Nov 15 16:25 .
drwxrwxr-x. 26 wmoore wmoore 4.0K Nov 15 16:27 ..
drwxrwxr-x. 6 wmoore wmoore 100 Nov 15 16:25 0
drwxrwxr-x. 6 wmoore wmoore 100 Nov 15 16:25 1
drwxrwxr-x. 6 wmoore wmoore 100 Nov 15 16:25 2
drwxrwxr-x. 6 wmoore wmoore 100 Nov 15 16:25 3
drwxrwxr-x. 6 wmoore wmoore 100 Nov 15 16:25 4
drwxrwxr-x. 6 wmoore wmoore 100 Nov 15 16:25 5
-rw-rw-r--. 1 wmoore wmoore 420 Nov 15 16:25 .zattrs
-rw-rw-r--. 1 wmoore wmoore 24 Nov 15 16:25 .zgroup
(omero_zarr_export) [wmoore@idrftp-ftp idr0016]$ ls -alh 24667.ome.zarr/N/10/1
total 12K
drwxrwxr-x. 6 wmoore wmoore 100 Nov 15 16:25 .
drwxrwxr-x. 8 wmoore wmoore 126 Nov 15 16:25 ..
drwxrwxr-x. 7 wmoore wmoore 94 Nov 15 16:25 0
drwxrwxr-x. 7 wmoore wmoore 94 Nov 15 16:25 1
drwxrwxr-x. 7 wmoore wmoore 94 Nov 15 16:25 2
drwxrwxr-x. 7 wmoore wmoore 94 Nov 15 16:25 3
-rw-rw-r--. 1 wmoore wmoore 4.5K Nov 15 16:25 .zattrs
-rw-rw-r--. 1 wmoore wmoore 24 Nov 15 16:25 .zgroup
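The same check can be scripted for every field of a well. This sketch only looks for the group metadata files seen in the listings above (`.zattrs` and `.zgroup`); it does not verify arrays or chunks, and `incomplete_fields` is a hypothetical helper name.

```shell
# For a well directory such as 24667.ome.zarr/N/10, print each field
# sub-directory that lacks a .zattrs or .zgroup file.
# incomplete_fields is a hypothetical helper, not part of omero-cli-zarr.
incomplete_fields() {
  for d in "$1"/*/; do
    [ -d "$d" ] || continue
    # $d ends with "/", so "$d.zattrs" is <field>/.zattrs
    if [ ! -f "$d.zattrs" ] || [ ! -f "$d.zgroup" ]; then
      printf '%s\n' "${d%/}"
    fi
  done
}

# Usage sketch: incomplete_fields 24667.ome.zarr/N/10
```

No output means every field directory has both metadata files.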
$ zip -r 24667.ome.zarr.zip 24667.ome.zarr
Delete 24667.ome.zarr.zip from https://www.ebi.ac.uk/biostudies/submissions/files?path=%2Fuser%2Fidr0016 and reupload...
sudo /root/.aspera/cli/bin/ascp -P33001 -i /root/.aspera/cli/etc/asperaweb_id_dsa.openssh -d /data/idr0016/idr0016/ bsaspera_w@hx-fasp-1.ebi.ac.uk:/5f/13xxxxxxx
24667.ome.zarr.zip 100% 5461MB 454Mb/s 01:34
Completed: 5592294K bytes transferred in 95 seconds
(480827K bits/sec), in 1 file, 1 directory.
idr0016 plates (Names) that are not yet viewable in idr-testing:
Let's run sql etc on clean idr0125-pilot data...
Update SECRET in sql... as wmoore
$ cd idr-util/scripts/ngff_filesets/idr0016
$ for i in $(ls); do sudo sed -i 's/SECRETUUID/c6b02bb7-2c22-4c45-be8d-30484c380a9c/g' $i; done
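A variant of the placeholder replacement that also confirms nothing was missed is sketched below. It assumes the files use the `SECRETUUID` placeholder shown in the sed command above and that the secret contains only characters safe in a sed replacement (a UUID does); `replace_secret` is a hypothetical helper name.

```shell
# Replace the SECRETUUID placeholder in every .sql file of a directory,
# then fail (listing the offenders) if any file still contains it.
# replace_secret is a hypothetical helper, not part of idr-utils.
replace_secret() {
  dir="$1"; secret="$2"
  for f in "$dir"/*.sql; do
    [ -e "$f" ] || continue
    # write to a temp file and move it into place instead of using sed -i
    sed "s/SECRETUUID/$secret/g" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
  done
  ! grep -l SECRETUUID "$dir"/*.sql 2>/dev/null
}

# Usage sketch: replace_secret . "$SECRET"
```

Globbing `*.sql` instead of parsing `ls` output avoids touching unrelated files in the directory.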
as omero-server user...
$ cd ngff_filesets/
$ export IDRID=idr0016
(venv3) (base) bash-4.2$ for r in $(cat $IDRID.csv); do
> biapath=$(echo $r | cut -d',' -f2)
> uuid=$(echo $biapath | cut -d'/' -f2)
> fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
> psql -U omero -d idr -h $DBHOST -f "$IDRID/$fsid.sql"
> omero mkngff symlink /data/OMERO/ManagedRepository $fsid "/bia-integrator-data/$biapath/$uuid.zarr" --bfoptions
> done
UPDATE 2304
BEGIN
mkngff_fileset
----------------
5288754
(1 row)
COMMIT
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-06/24/05-33-04.817
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-06/24/05-33-04.817_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-06/24/05-33-04.817_mkngff/000f81bf-a7b2-4610-99c3-47dc5fec8c92.zarr -> /bia-integrator-data/S-BIAD851/000f81bf-a7b2-4610-99c3-47dc5fec8c92/000f81bf-a7b2-4610-99c3-47dc5fec8c92.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-06/24/05-33-04.817
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2016-06/24/05-33-04.817_mkngff/000f81bf-a7b2-4610-99c3-47dc5fec8c92.zarr.bfoptions
UPDATE 2304
BEGIN
mkngff_fileset
----------------
5288755
(1 row)
COMMIT
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2017-08/17/02-13-40.469
Creating dir at /data/OMERO/ManagedRepository/demo_2/2017-08/17/02-13-40.469_mkngff
...
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-06/19/23-32-50.888
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-06/19/23-32-50.888_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-06/19/23-32-50.888_mkngff/fd822d4b-3060-46e9-8178-982510009c93.zarr -> /bia-integrator-data/S-BIAD851/fd822d4b-3060-46e9-8178-982510009c93/fd822d4b-3060-46e9-8178-982510009c93.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-06/19/23-32-50.888
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2016-06/19/23-32-50.888_mkngff/fd822d4b-3060-46e9-8178-982510009c93.zarr.bfoptions
UPDATE 2304
BEGIN
mkngff_fileset
----------------
5289164
(1 row)
COMMIT
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-06/27/02-33-37.895
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-06/27/02-33-37.895_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-06/27/02-33-37.895_mkngff/fdf51c56-0ecf-4e1c-8b47-c35cafd78a2c.zarr -> /bia-integrator-data/S-BIAD851/fdf51c56-0ecf-4e1c-8b47-c35cafd78a2c/fdf51c56-0ecf-4e1c-8b47-c35cafd78a2c.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-06/27/02-33-37.895
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2016-06/27/02-33-37.895_mkngff/fdf51c56-0ecf-4e1c-8b47-c35cafd78a2c.zarr.bfoptions
UPDATE 2304
BEGIN
mkngff_fileset
----------------
5289165
(1 row)
COMMIT
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-06/20/22-41-20.985
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-06/20/22-41-20.985_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-06/20/22-41-20.985_mkngff/feea9b2d-dd05-428a-a04e-5ebd45048401.zarr -> /bia-integrator-data/S-BIAD851/feea9b2d-dd05-428a-a04e-5ebd45048401/feea9b2d-dd05-428a-a04e-5ebd45048401.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-06/20/22-41-20.985
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2016-06/20/22-41-20.985_mkngff/feea9b2d-dd05-428a-a04e-5ebd45048401.zarr.bfoptions
UPDATE 2304
BEGIN
mkngff_fileset
----------------
5289166
(1 row)
COMMIT
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-06/27/12-57-28.592
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-06/27/12-57-28.592_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-06/27/12-57-28.592_mkngff/ff85e5f2-258a-46ad-bdd0-d4f296aec28e.zarr -> /bia-integrator-data/S-BIAD851/ff85e5f2-258a-46ad-bdd0-d4f296aec28e/ff85e5f2-258a-46ad-bdd0-d4f296aec28e.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-06/27/12-57-28.592
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2016-06/27/12-57-28.592_mkngff/ff85e5f2-258a-46ad-bdd0-d4f296aec28e.zarr.bfoptions
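The loop that produced this output strips whitespace from the fileset ID with `tr -d '[:space:]'`. The log does not say why; one plausible reason is a stray carriage return or space at the end of a CSV row, which would otherwise end up in the `.sql` file name. A small sketch of the effect, with `fsid_of` as a hypothetical helper name:

```shell
# Extract the fileset ID (3rd CSV column) and strip any whitespace,
# e.g. a carriage return left by Windows-style line endings.
# fsid_of is a hypothetical helper name.
fsid_of() {
  printf '%s' "$1" | cut -d',' -f3 | tr -d '[:space:]'
}

# A row ending in a carriage return still yields a clean ID.
row=$(printf 'idr0016/26110.ome.zarr,S-BIAD851/ff85e5f2-258a-46ad-bdd0-d4f296aec28e,21526\r')
fsid_of "$row"
```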
Viewing images (first 2 viewed, waiting...):
http://localhost:1040/webclient/?show=image-2330212
http://localhost:1040/webclient/?show=image-2340843
This one got an error: http://localhost:1040/webclient/?show=image-2376573
serverExceptionClass = ome.conditions.ResourceError
message = Error instantiating pixel buffer: /data/OMERO/ManagedRepository/demo_2/2016-06/21/01-46-55.560_mkngff/5aec8bec-8573-44ec-9e9e-24fb81623fbe.zarr/B/1/.zattrs
Done but not tried viewing yet:
http://localhost:1040/webclient/?show=image-2486279
http://localhost:1040/webclient/?show=image-2131185
Not done yet: http://localhost:1040/webclient/?show=image-2435591
Looking at the last Fileset generated above, 5289166: find Image ID via psql...
Fileset doesn't have clientpath set:
last row of idr0016.csv:
idr0016/26110.ome.zarr,S-BIAD851/ff85e5f2-258a-46ad-bdd0-d4f296aec28e,21526
looking at 21526.sql
...
UPDATE pixels SET name = '.zattrs', path = 'demo_2/2016-06/27/12-57-28.592_mkngff/ff85e5f2-258a-46ad-bdd0-d4f296aec28e.zarr' where image in (select id from Image where fileset = 21526);
begin;
select mkngff_fileset(
21526,
'c6b02bb7-2c22-4c45-be8d-30484c380a9c',
'cdf35825-def1-4580-8d0b-9c349b8f78d6',
'demo_2/2016-06/27/12-57-28.592_mkngff/',
array[
['demo_2/2016-06/27/12-57-28.592_mkngff/ff85e5f2-258a-46ad-bdd0-d4f296aec28e.zarr/', '.zattrs', 'application/octet-stream', 'https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/S-BIAD851/ff85e5f2-258a-46ad-bdd0-d4f296aec28e/ff85e5f2-258a-46ad-bdd0-d4f296aec28e.zarr/.zattrs'],
['demo_2/2016-06/27/12-57-28.592_mkngff/ff85e5f2-258a-46ad-bdd0-d4f296aec28e.zarr/', '.zgroup', 'application/octet-stream', 'https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/S-BIAD851/ff85e5f2-258a-46ad-bdd0-d4f296aec28e/ff85e5f2-258a-46ad-bdd0-d4f296aec28e.zarr/.zgroup'],
...
Ah!!! - I forgot to update and run setup.sql which creates the mkngff_fileset() sql function!
https://github.com/IDR/idr0016-wawer-bioactivecompoundprofiling
Sample plate conversion failed with:
This error, Character reference "�" is an invalid XML character, is already referenced by https://github.com/IDR/bioformats/issues/29 .