Closed by brettkoonce 3 years ago
@BradLarson basically, the training loop code requires the model to be a var (not a let) so it can actually be modified --> when the save callback gets called while the loop is still mutating the model, we get a memory access conflict, e.g.:
Simultaneous accesses to 0x5581af6921e0, but modification requires exclusive access.
Previous access (a modification) started at ResNet50-ImageNet`<unavailable> + 14202123 (0x5581aebaa50b).
Current access (a read) started at:
0 libswiftCore.so 0x00007efe35e9a980 swift_beginAccess + 479
1 ResNet50-ImageNet 0x00005581aebab525 <unavailable> + 14206245
2 ResNet50-ImageNet 0x00005581aebac738 <unavailable> + 14210872
3 ResNet50-ImageNet 0x00005581aebac5e2 <unavailable> + 14210530
4 ResNet50-ImageNet 0x00005581aebac784 <unavailable> + 14210948
5 ResNet50-ImageNet 0x00005581aee04a47 <unavailable> + 16669255
6 ResNet50-ImageNet 0x00005581aee0f598 <unavailable> + 16713112
7 ResNet50-ImageNet 0x00005581aee04240 <unavailable> + 16667200
8 ResNet50-ImageNet 0x00005581aee06b56 <unavailable> + 16677718
9 ResNet50-ImageNet 0x00005581aebaa656 <unavailable> + 14202454
10 libc.so.6 0x00007efe17c77ab0 __libc_start_main + 231
11 ResNet50-ImageNet 0x00005581adecfaba <unavailable> + 723642
Fatal access conflict detected.
Aborted (core dumped)
Is there a lazy trick (e.g. a shadow copy or a mutex of some form) to deal with this, or am I using the API incorrectly / is there a better place to deal with this?
dispatch queues ftw
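For anyone landing here later, a rough sketch of what "shadow copy + dispatch queue" could look like. The thread doesn't show the actual fix, so everything here is an assumption: `ToyModel`, `CheckpointWriter`, and `train` are hypothetical stand-ins, not the swift-models training loop API. The idea is that the callback gets the model passed in by value (a snapshot) and never reads the outer `var` while the loop holds exclusive access to it, and the save itself runs on a serial `DispatchQueue`.

```swift
import Dispatch

// Hypothetical stand-in for a model. It is a struct, so passing it
// by value gives the receiver an independent snapshot.
struct ToyModel {
    var weights: [Float]
}

// Hypothetical checkpoint writer. Saves are serialized on one queue;
// a real version would write to disk instead of appending to an array.
final class CheckpointWriter {
    private let queue = DispatchQueue(label: "checkpoint.writer")
    private var saved: [[Float]] = []

    // Receives a snapshot, so later training steps cannot change
    // what ends up being saved.
    func save(_ snapshot: ToyModel) {
        queue.async {
            self.saved.append(snapshot.weights)
        }
    }

    // Blocks until all queued saves have run, then returns them.
    func finishedCheckpoints() -> [[Float]] {
        return queue.sync { saved }
    }
}

// Hypothetical training loop. It holds exclusive (inout) access to the
// model and hands the callback a copy instead of letting the callback
// capture and read the outer variable.
func train(_ model: inout ToyModel, steps: Int, onStepEnd: (ToyModel) -> Void) {
    for _ in 0..<steps {
        for i in model.weights.indices {
            model.weights[i] += 1  // placeholder for a real optimizer update
        }
        onStepEnd(model)  // pass a snapshot by value
    }
}

var model = ToyModel(weights: [0, 0])
let writer = CheckpointWriter()

// The closure only touches `snapshot`, never the outer `model`.
train(&model, steps: 3) { snapshot in
    writer.save(snapshot)
}

let checkpoints = writer.finishedCheckpoints()
```

The part that matters for the exclusivity error is the closure: if it read `model` directly while `train(&model, ...)` was still running, the read would overlap the ongoing modification, which is the kind of access conflict reported above. The queue only keeps the saves ordered and off the training path.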