opencadc / vostools

VOSpace command line and FUSE clients
https://www.canfar.net/en/docs/storage
12 stars 18 forks source link

unhandled exceptions when opening a file for read/write/append #41

Open NormanHill opened 10 years ago

NormanHill commented 10 years ago

It seems likely that the meta data, and in particular the bitmap size no longer matches the data file in the cache. I don't believe this is possible in the log shown, but it may have happened earlier.

This may be possible - the allocation of a new CacheMetaData object in flushNode, setHeader, and open are not protected by the same mechanism. It is possible for a race condition to allow the meta data to be set incorrectly.

The conditions which would allo this to happen would be many threads simultaneously opening the same file, appending to it and closing the file.

A sample of the cache data and meta data file are available for further investigation.

There may be other race conditions associated with opening files while they are being flushed, written, or truncated.

Also, there seems to be a bug near line 208 of CadcCache.py:

                    fileHandle.fileSize = os.path.getsize(
                            fileHandle.cacheMetaDataFile)

I believe the size should be set to the size of data file, not the size of the meta data file.

Following is the output from one vofs mount with a CASA application running:

2014-09-04 12:52:28,684 47838481070400 vos-1.10.4fuse.call.845 -> open /test.src.split.ms.contsub.chan/HISTORY/table.f0 (32770,) 2014-09-04 12:52:28,684 47838481070400 vos-1.10.4vofs.open.462 Opening /test.src.split.ms.contsub.chan/HISTORY/table.f0 with flags a+ 2014-09-04 12:52:28,685 47838481070400 vos-1.10.4CadcCache.getAttr.520 gettattr /test.src.split.ms.contsub.chan/HISTORY/table.f0: 2014-09-04 12:52:28,685 47838481070400 vos-1.10.4vofs.getNode.390 force? -> False path -> /test.src.split.ms.contsub.chan/HISTORY/table.f0 2014-09-04 12:52:28,685 47838481070400 vos-1.10.4vofs.getNode.394 requesting node /test.src.split.ms.contsub.chan/HISTORY/table.f0 from VOSpace 2014-09-04 12:52:28,685 47838481070400 vos-1.10.4vos.getNode.1095 Getting node /test.src.split.ms.contsub.chan/HISTORY/table.f0 2014-09-04 12:52:28,685 47838481070400 vos-1.10.4vofs.getNode.390 force? -> False path -> /test.src.split.ms.contsub.chan/HISTORY 2014-09-04 12:52:28,685 47838481070400 vos-1.10.4vofs.getNode.394 requesting node /test.src.split.ms.contsub.chan/HISTORY from VOSpace 2014-09-04 12:52:28,685 47838481070400 vos-1.10.4vos.getNode.1095 Getting node /test.src.split.ms.contsub.chan/HISTORY 2014-09-04 12:52:28,685 47838481070400 vos-1.10.4CadcCache.init.732 creating a new File Handle for /test.src.split.ms.contsub.chan/HISTORY/table.f0 2014-09-04 12:52:28,686 47838481070400 vos-1.10.4CadcCache.open.192 Opening file /test.src.split.ms.contsub.chan/HISTORY/table.f0: isnew False: id 247351696 2014-09-04 12:52:28,686 47838481070400 vos-1.10.4CadcCache.readData.1174 Getting data: 0 0 None 2014-09-04 12:52:28,686 47838639360320 vos-1.10.4vofs.readFromBacking.79 reading range: None 2014-09-04 12:52:28,686 47838639360320 vos-1.10.4vofs.readFromBacking.83 Opening a new vo file on /test.src.split.ms.contsub.chan/HISTORY/table.f0 2014-09-04 12:52:28,687 47838639360320 vos-1.10.4vos.getNode.1095 Getting node /test.src.split.ms.contsub.chan/HISTORY/table.f0 2014-09-04 12:52:28,687 47838639360320 vos-1.10.4vos.getNodeURL.1142 Getting URL for: vos://cadc.nrc.ca!vospace/dcolombo/SextansA_test/test.src.split.ms.contsub.chan/HISTORY/table.f0 2014-09-04 12:52:28,687 47838639360320 vos-1.10.4vos.getNodeURL.1152 do_shortcut=1 method=GET view=data 2014-09-04 12:52:28,687 47838639360320 vos-1.10.4vos.getNodeURL.1167 parts: [scheme: vos, netloc: cadc.nrc.ca!vospace, path: /dcolombo/SextansA_test/test.src.split.ms.contsub.chan/HISTORY/table.f0] 2014-09-04 12:52:28,687 47838639360320 vos-1.10.4vos.getNodeURL.1196 URL: https://www.canfar.phys.uvic.ca 2014-09-04 12:52:28,687 47838639360320 vos-1.10.4vos.get_connection.102 Trying to connect to https://www.canfar.phys.uvic.ca using /home/dcolombo/.ssl/cadcproxy.pem 2014-09-04 12:52:28,687 47838639360320 vos-1.10.4vos.get_connection.124 Opening the connection 2014-09-04 12:52:28,850 47838639360320 vos-1.10.4vos.getNodeURL.1231 Sending short cut url: http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/transfer/bmVpZ2g9Q0FEQyZzaWdleHA9MjAxNDA5MDQyMzUzMDQmYXJjPVZPU3BhYyZpZD0yMTg3MDM5MTcmdXNlcj1kY29sb21ibyZucz1kY29sb21ibyUyRlNleHRhbnNBX3Rlc3QlMkZ0ZXN0LnNyYy5zcGxpdC5tcy5jb250c3ViLmNoYW4lMkZISVNUT1JZJmF1dGhDb2RlPTAmcnVuSUQ9YTg0a2IyYjZwdHdyZDN2ZyZ1cmw9aHR0cCUzQSUyRiUyRnN0b3I2LTA4JTNBODg4OCUyRmFyY2hpdmUlMkY4MWI3ZjExOSUyRnZvc3BhYyUyRmRlZmF1bHQlMkZkZWZhdWx0XzA2NjglMkYwMCUyRjAwJTJGMTglMkYyMTg3MDM5MTclMjUzQiUyNTNCMjE2ODk3NTkwJnVybD1odHRwJTNBJTJGJTJGc3RvcjYtMDclM0E4ODg4JTJGYXJjaGl2ZSUyRjkyYTRhNDg5JTJGdm9zcGFjJTJGZGVmYXVsdCUyRmRlZmF1bHRfMDY2OCUyRjAwJTJGMDAlMkYxOCUyRjIxODcwMzkxNyUyNTNCJTI1M0IyMTY4OTc1OTAmc2VydmljZVVybD1odHRwJTNBJTJGJTJGd3d3LmNhbmZhci5waHlzLnV2aWMuY2ElMkZ2b3NwYWNlJTJGc3luY3RyYW5zJTNGVEFSR0VUJTNEdm9zJTI1M0ElMjUyRiUyNTJGY2FkYy5ucmMuY2ElMjUyMXZvc3BhY2UlMjUyRmRjb2xvbWJvJTI1MkZTZXh0YW5zQV90ZXN0JTI1MkZ0ZXN0LnNyYy5zcGxpdC5tcy5jb250c3ViLmNoYW4lMjUyRkhJU1RPUlklMjUyRnRhYmxlLmYwJTI2RElSRUNUSU9OJTNEcHVsbEZyb21Wb1NwYWNlJTI2UFJPVE9DT0wlM0Rpdm8lMjUzQSUyNTJGJTI1MkZpdm9hLm5ldCUyNTJGdm9zcGFjZSUyNTJGY29yZSUyNTIzaHR0cGdldCUyNnZpZXclM0RkYXRh-ZmlkPXRhYmxlLmYwJmNsbj0xMDg3NDg4JmNtZD05ZjUxOTU1MmE5YjVhMmQyMTA4MjM2YTRmOTJiZTk5NiZsbWQ9MTQwODQ3NjMwNDY4Nw==-3a9a0d07f3afca1d0bc0ef6ca7e3860f/table.f0 2014-09-04 12:52:28,851 47838639360320 vos-1.10.4vos.open.750 Opening http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/transfer/bmVpZ2g9Q0FEQyZzaWdleHA9MjAxNDA5MDQyMzUzMDQmYXJjPVZPU3BhYyZpZD0yMTg3MDM5MTcmdXNlcj1kY29sb21ibyZucz1kY29sb21ibyUyRlNleHRhbnNBX3Rlc3QlMkZ0ZXN0LnNyYy5zcGxpdC5tcy5jb250c3ViLmNoYW4lMkZISVNUT1JZJmF1dGhDb2RlPTAmcnVuSUQ9YTg0a2IyYjZwdHdyZDN2ZyZ1cmw9aHR0cCUzQSUyRiUyRnN0b3I2LTA4JTNBODg4OCUyRmFyY2hpdmUlMkY4MWI3ZjExOSUyRnZvc3BhYyUyRmRlZmF1bHQlMkZkZWZhdWx0XzA2NjglMkYwMCUyRjAwJTJGMTglMkYyMTg3MDM5MTclMjUzQiUyNTNCMjE2ODk3NTkwJnVybD1odHRwJTNBJTJGJTJGc3RvcjYtMDclM0E4ODg4JTJGYXJjaGl2ZSUyRjkyYTRhNDg5JTJGdm9zcGFjJTJGZGVmYXVsdCUyRmRlZmF1bHRfMDY2OCUyRjAwJTJGMDAlMkYxOCUyRjIxODcwMzkxNyUyNTNCJTI1M0IyMTY4OTc1OTAmc2VydmljZVVybD1odHRwJTNBJTJGJTJGd3d3LmNhbmZhci5waHlzLnV2aWMuY2ElMkZ2b3NwYWNlJTJGc3luY3RyYW5zJTNGVEFSR0VUJTNEdm9zJTI1M0ElMjUyRiUyNTJGY2FkYy5ucmMuY2ElMjUyMXZvc3BhY2UlMjUyRmRjb2xvbWJvJTI1MkZTZXh0YW5zQV90ZXN0JTI1MkZ0ZXN0LnNyYy5zcGxpdC5tcy5jb250c3ViLmNoYW4lMjUyRkhJU1RPUlklMjUyRnRhYmxlLmYwJTI2RElSRUNUSU9OJTNEcHVsbEZyb21Wb1NwYWNlJTI2UFJPVE9DT0wlM0Rpdm8lMjUzQSUyNTJGJTI1MkZpdm9hLm5ldCUyNTJGdm9zcGFjZSUyNTJGY29yZSUyNTIzaHR0cGdldCUyNnZpZXclM0RkYXRh-ZmlkPXRhYmxlLmYwJmNsbj0xMDg3NDg4JmNtZD05ZjUxOTU1MmE5YjVhMmQyMTA4MjM2YTRmOTJiZTk5NiZsbWQ9MTQwODQ3NjMwNDY4Nw==-3a9a0d07f3afca1d0bc0ef6ca7e3860f/table.f0 (GET) 2014-09-04 12:52:28,851 47838639360320 vos-1.10.4vos.get_connection.102 Trying to connect to http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca using /home/dcolombo/.ssl/cadcproxy.pem 2014-09-04 12:52:28,851 47838639360320 vos-1.10.4vos.get_connection.124 Opening the connection 2014-09-04 12:52:28,855 47838639360320 vos-1.10.4vofs.readFromBacking.95 reading from http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/transfer/bmVpZ2g9Q0FEQyZzaWdleHA9MjAxNDA5MDQyMzUzMDQmYXJjPVZPU3BhYyZpZD0yMTg3MDM5MTcmdXNlcj1kY29sb21ibyZucz1kY29sb21ibyUyRlNleHRhbnNBX3Rlc3QlMkZ0ZXN0LnNyYy5zcGxpdC5tcy5jb250c3ViLmNoYW4lMkZISVNUT1JZJmF1dGhDb2RlPTAmcnVuSUQ9YTg0a2IyYjZwdHdyZDN2ZyZ1cmw9aHR0cCUzQSUyRiUyRnN0b3I2LTA4JTNBODg4OCUyRmFyY2hpdmUlMkY4MWI3ZjExOSUyRnZvc3BhYyUyRmRlZmF1bHQlMkZkZWZhdWx0XzA2NjglMkYwMCUyRjAwJTJGMTglMkYyMTg3MDM5MTclMjUzQiUyNTNCMjE2ODk3NTkwJnVybD1odHRwJTNBJTJGJTJGc3RvcjYtMDclM0E4ODg4JTJGYXJjaGl2ZSUyRjkyYTRhNDg5JTJGdm9zcGFjJTJGZGVmYXVsdCUyRmRlZmF1bHRfMDY2OCUyRjAwJTJGMDAlMkYxOCUyRjIxODcwMzkxNyUyNTNCJTI1M0IyMTY4OTc1OTAmc2VydmljZVVybD1odHRwJTNBJTJGJTJGd3d3LmNhbmZhci5waHlzLnV2aWMuY2ElMkZ2b3NwYWNlJTJGc3luY3RyYW5zJTNGVEFSR0VUJTNEdm9zJTI1M0ElMjUyRiUyNTJGY2FkYy5ucmMuY2ElMjUyMXZvc3BhY2UlMjUyRmRjb2xvbWJvJTI1MkZTZXh0YW5zQV90ZXN0JTI1MkZ0ZXN0LnNyYy5zcGxpdC5tcy5jb250c3ViLmNoYW4lMjUyRkhJU1RPUlklMjUyRnRhYmxlLmYwJTI2RElSRUNUSU9OJTNEcHVsbEZyb21Wb1NwYWNlJTI2UFJPVE9DT0wlM0Rpdm8lMjUzQSUyNTJGJTI1MkZpdm9hLm5ldCUyNTJGdm9zcGFjZSUyNTJGY29yZSUyNTIzaHR0cGdldCUyNnZpZXclM0RkYXRh-ZmlkPXRhYmxlLmYwJmNsbj0xMDg3NDg4JmNtZD05ZjUxOTU1MmE5YjVhMmQyMTA4MjM2YTRmOTJiZTk5NiZsbWQ9MTQwODQ3NjMwNDY4Nw==-3a9a0d07f3afca1d0bc0ef6ca7e3860f/table.f0 2014-09-04 12:52:28,894 47838639360320 vos-1.10.4vos.checkstatus.727 status 200 for URL http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/transfer/bmVpZ2g9Q0FEQyZzaWdleHA9MjAxNDA5MDQyMzUzMDQmYXJjPVZPU3BhYyZpZD0yMTg3MDM5MTcmdXNlcj1kY29sb21ibyZucz1kY29sb21ibyUyRlNleHRhbnNBX3Rlc3QlMkZ0ZXN0LnNyYy5zcGxpdC5tcy5jb250c3ViLmNoYW4lMkZISVNUT1JZJmF1dGhDb2RlPTAmcnVuSUQ9YTg0a2IyYjZwdHdyZDN2ZyZ1cmw9aHR0cCUzQSUyRiUyRnN0b3I2LTA4JTNBODg4OCUyRmFyY2hpdmUlMkY4MWI3ZjExOSUyRnZvc3BhYyUyRmRlZmF1bHQlMkZkZWZhdWx0XzA2NjglMkYwMCUyRjAwJTJGMTglMkYyMTg3MDM5MTclMjUzQiUyNTNCMjE2ODk3NTkwJnVybD1odHRwJTNBJTJGJTJGc3RvcjYtMDclM0E4ODg4JTJGYXJjaGl2ZSUyRjkyYTRhNDg5JTJGdm9zcGFjJTJGZGVmYXVsdCUyRmRlZmF1bHRfMDY2OCUyRjAwJTJGMDAlMkYxOCUyRjIxODcwMzkxNyUyNTNCJTI1M0IyMTY4OTc1OTAmc2VydmljZVVybD1odHRwJTNBJTJGJTJGd3d3LmNhbmZhci5waHlzLnV2aWMuY2ElMkZ2b3NwYWNlJTJGc3luY3RyYW5zJTNGVEFSR0VUJTNEdm9zJTI1M0ElMjUyRiUyNTJGY2FkYy5ucmMuY2ElMjUyMXZvc3BhY2UlMjUyRmRjb2xvbWJvJTI1MkZTZXh0YW5zQV90ZXN0JTI1MkZ0ZXN0LnNyYy5zcGxpdC5tcy5jb250c3ViLmNoYW4lMjUyRkhJU1RPUlklMjUyRnRhYmxlLmYwJTI2RElSRUNUSU9OJTNEcHVsbEZyb21Wb1NwYWNlJTI2UFJPVE9DT0wlM0Rpdm8lMjUzQSUyNTJGJTI1MkZpdm9hLm5ldCUyNTJGdm9zcGFjZSUyNTJGY29yZSUyNTIzaHR0cGdldCUyNnZpZXclM0RkYXRh-ZmlkPXRhYmxlLmYwJmNsbj0xMDg3NDg4JmNtZD05ZjUxOTU1MmE5YjVhMmQyMTA4MjM2YTRmOTJiZTk5NiZsbWQ9MTQwODQ3NjMwNDY4Nw==-3a9a0d07f3afca1d0bc0ef6ca7e3860f/table.f0 2014-09-04 12:52:28,894 47838639360320 vos-1.10.4CadcCache.setHeader.779 size: 1087488 md5: 9f519552a9b5a2d2108236a4f92be996 2014-09-04 12:52:28,894 47838639360320 vos-1.10.4vofs.readFromBacking.121 Writing: 16384 bytes at 0 to cache for /test.src.split.ms.contsub.chan/HISTORY/table.f0 2014-09-04 12:52:28,894 47838639360320 vos-1.10.4CadcCache.writeToCache.702 reading to cache aborted for /test.src.split.ms.contsub.chan/HISTORY/table.f0 2014-09-04 12:52:28,895 47838639360320 vos-1.10.4vofs.readFromBacking.137 Wrote: 0 bytes to cache for /test.src.split.ms.contsub.chan/HISTORY/table.f0 2014-09-04 12:52:28,895 47838639360320 vos-1.10.4CadcCache.execute.1237 setFullyCached? True 0 None 1087488 2014-09-04 12:52:28,895 47838639360320 vos-1.10.4CadcCache.execute.1267 read thread finished 2014-09-04 12:52:28,899 47838481070400 vos-1.10.4CadcCache.setHeader.779 size: 1087488 md5: 9f519552a9b5a2d2108236a4f92be996 2014-09-04 12:52:28,899 47838481070400 vos-1.10.4fuse.call.854 <- open 247350480 2014-09-04 12:52:28,899 47838618380608 vos-1.10.4fuse.call.845 -> read /test.src.split.ms.contsub.chan/HISTORY/table.f0 (16384L, 0, 247350480L) 2014-09-04 12:52:28,899 47838618380608 vos-1.10.4vofs.read.560 reading range: /test.src.split.ms.contsub.chan/HISTORY/table.f0 16384 0 247350480 2014-09-04 12:52:28,900 47838618380608 vos-1.10.4CadcCache.read.1022 reading 16384 bytes at 0 2014-09-04 12:52:28,900 47838618380608 vos-1.10.4fuse.call.854 <- read <ctypes.c_char_Array_16384 object at 0xea5d200> 2014-09-04 12:52:28,900 47838491560256 vos-1.10.4fuse.call.845 -> getattr /test.src.split.ms.contsub.chan/HISTORY/table.f0 (None,) 2014-09-04 12:52:28,900 47838491560256 vos-1.10.4vofs.getattr.413 getting attributes of /test.src.split.ms.contsub.chan/HISTORY/table.f0 2014-09-04 12:52:28,900 47838491560256 vos-1.10.4CadcCache.getAttr.520 gettattr /test.src.split.ms.contsub.chan/HISTORY/table.f0: 2014-09-04 12:52:28,900 47838491560256 vos-1.10.4vofs.getNode.390 force? -> False path -> /test.src.split.ms.contsub.chan/HISTORY/table.f0 2014-09-04 12:52:28,900 47838491560256 vos-1.10.4vofs.getNode.394 requesting node /test.src.split.ms.contsub.chan/HISTORY/table.f0 from VOSpace 2014-09-04 12:52:28,901 47838491560256 vos-1.10.4vos.getNode.1095 Getting node /test.src.split.ms.contsub.chan/HISTORY/table.f0 2014-09-04 12:52:28,901 47838491560256 vos-1.10.4fuse.call.854 <- getattr {'st_ctime': 1408472704.0, 'st_mtime': 1408472704.0, 'st_blocks': 2124, 'st_gid': 500, 'st_nlink': 1, 'st_mode': 33221, 'st_size': 1087488, 'st_atime': 1409860276.4991741, 'st_uid': 500} 2014-09-04 12:52:28,901 47838481070400 vos-1.10.4fuse.call.845 -> read /test.src.split.ms.contsub.chan/HISTORY/table.f0 (8192L, 1081344, 247350480L) 2014-09-04 12:52:28,901 47838481070400 vos-1.10.4vofs.read.560 reading range: /test.src.split.ms.contsub.chan/HISTORY/table.f0 8192 1081344 247350480 2014-09-04 12:52:28,901 47838481070400 vos-1.10.4CadcCache.read.1022 reading 8192 bytes at 1081344 2014-09-04 12:52:28,901 47838481070400 vos-1.10.4fuse.call.854 <- read <ctypes.c_char_Array_6144 object at 0xea5d050> 2014-09-04 12:52:28,902 47838618380608 vos-1.10.4fuse.call.845 -> write /test.src.split.ms.contsub.chan/HISTORY/table.f0 (<vos.fuse.LP_c_byte object at 0xe746a70>, 2816L, 1081856, 247350480L) 2014-09-04 12:52:28,902 47838618380608 vos-1.10.4vofs.write.772 /test.src.split.ms.contsub.chan/HISTORY/table.f0 -> <vos.vofs.HandleWrapper object at 0xebe44d0> 2014-09-04 12:52:28,902 47838618380608 vos-1.10.4vofs.write.773 1081856 --> 1084672 2014-09-04 12:52:28,902 47838618380608 vos-1.10.4CadcCache.write.972 writting 2816 bytes at 1081856 to 8 2014-09-04 12:52:28,902 47838618380608 vos-1.10.4logExceptions.wrapper.15 Exception throw: <type 'exceptions.ValueError'> index range error 2014-09-04 12:52:28,903 47838618380608 vos-1.10.4logExceptions.wrapper.16 Traceback (most recent call last): File "/usr/lib/python2.6/site-packages/vos-1.10.4-py2.6.egg/vos/logExceptions.py", line 13, in wrapper return func(_args, *_kwds) File "/usr/lib/python2.6/site-packages/vos-1.10.4-py2.6.egg/vos/vofs.py", line 776, in write return fh.cacheFileHandle.write(data, size, offset) File "/usr/lib/python2.6/site-packages/vos-1.10.4-py2.6.egg/vos/CadcCache.py", line 995, in write self.makeCached(offset, self.fileSize) File "/usr/lib/python2.6/site-packages/vos-1.10.4-py2.6.egg/vos/CadcCache.py", line 1065, in makeCached requiredRange = self.metaData.getRange(firstBlock, lastBlock) File "/usr/lib/python2.6/site-packages/vos-1.10.4-py2.6.egg/vos/CacheMetaData.py", line 100, in getRange if self.bitmap[i] == 0: File "/usr/lib/python2.6/site-packages/vos-1.10.4-py2.6.egg/vos/BitVector.py", line 1466, in _getbit raise ValueError( "index range error" ) ValueError: index range error

ijiraq commented 7 years ago

This might have been resolve. A test case would should be built and run.

ijiraq commented 7 years ago

A test case would should be built and run. (closed by mistake)