Closed pford closed 1 year ago
@kswang1029 macOS crash has been fixed. Please let me know if there are any remaining issues.
@pford It has happened to me a few times that when I drag the PV cut around, the preview PV image suddenly stops updating and hangs. I will try to find a robust way to reproduce it.
UPDATE: once I wait for a while, the log shows
[libprotobuf ERROR google/protobuf/message_lite.cc:480] CARTA.PvPreviewData exceeded maximum protobuf size of 2GB: 36254064052
and the debug message like
[2023-04-15 01:41:37.913] [CARTA] [debug] File 0 region 2 line profile 2248 of 2249 max num pixels=0
shows very large numbers!
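To put the reported size in context, here is a minimal sketch of a pre-serialization size check. It assumes the limit is `2 ** 31 - 1` bytes (the "2GB" in the error text); `previewPayloadBytes`, `fitsInProtobuf`, and the `numChannels` parameter are hypothetical names for illustration, not part of the backend.

```typescript
// Assumed limit: the "2GB" in the libprotobuf error, taken as INT_MAX bytes.
const PROTOBUF_MAX_BYTES = 2 ** 31 - 1;

// Hypothetical helper: bytes needed for a float32 preview image with
// one row per line profile and one column per spectral channel.
function previewPayloadBytes(numProfiles: number, numChannels: number): number {
    return numProfiles * numChannels * 4;
}

// Hypothetical guard to run before attempting to serialize a message.
function fitsInProtobuf(bytes: number): boolean {
    return bytes <= PROTOBUF_MAX_BYTES;
}

// Size quoted in the error message above.
const reportedBytes = 36254064052;
const overLimitFactor = reportedBytes / PROTOBUF_MAX_BYTES;

console.log(fitsInProtobuf(reportedBytes));            // false
console.log(fitsInProtobuf(previewPayloadBytes(525, 100))); // true
```

With these numbers the reported message is roughly 17 times larger than the limit, so the preview data was far beyond what a single message could carry.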
UPDATE2: Got it again
[2023-04-15 01:48:25.222] [CARTA] [debug] PV preview profile 517 of 525 max num pixels=9
[2023-04-15 01:48:25.222] [CARTA] [debug] PV preview profile 518 of 525 max num pixels=9
[2023-04-15 01:48:25.223] [CARTA] [debug] PV preview profile 519 of 525 max num pixels=9
[2023-04-15 01:48:25.223] [CARTA] [debug] PV preview profile 520 of 525 max num pixels=9
[2023-04-15 01:48:25.224] [CARTA] [debug] PV preview profile 521 of 525 max num pixels=9
[2023-04-15 01:48:25.224] [CARTA] [debug] PV preview profile 522 of 525 max num pixels=9
[2023-04-15 01:48:25.224] [CARTA] [debug] PV preview profile 523 of 525 max num pixels=6
[2023-04-15 01:48:25.225] [CARTA] [debug] PV preview profile 524 of 525 max num pixels=3
[2023-04-15 01:48:25.228] [CARTA] [debug] Updating pv preview 0 for region 2
[2023-04-15 01:48:25.229] [CARTA] [debug] Fixed pixel offsets not linear
[2023-04-15 01:48:25.229] [CARTA] [debug] Fixed pixel offsets not linear
[2023-04-15 01:48:25.276] [CARTA] [debug] Using fixed angular increment for line profiles.
[2023-04-15 01:48:25.276] [CARTA] [debug] Cancel line profiles: region/file closed or changed
[2023-04-15 01:48:25.277] [CARTA] [info] Line region 2 spatial profile was cancelled.
[2023-04-15 01:48:27.273] [CARTA] [debug] Using fixed angular increment for line profiles.
[2023-04-15 01:49:50.118] [CARTA] [debug] PV preview profile 12658322 of 25317170 max num pixels=10
[2023-04-15 01:49:50.137] [CARTA] [debug] PV preview profile 12658323 of 25317170 max num pixels=10
[2023-04-15 01:49:50.143] [CARTA] [debug] PV preview profile 12658324 of 25317170 max num pixels=9
[2023-04-15 01:49:50.148] [CARTA] [debug] PV preview profile 12658325 of 25317170 max num pixels=9
[2023-04-15 01:49:50.152] [CARTA] [debug] PV preview profile 12658326 of 25317170 max num pixels=9
[2023-04-15 01:49:50.152] [CARTA] [debug] PV preview profile 12658327 of 25317170 max num pixels=8
[2023-04-15 01:49:50.155] [CARTA] [debug] PV preview profile 12658328 of 25317170 max num pixels=9
[2023-04-15 01:49:50.160] [CARTA] [debug] PV preview profile 12658329 of 25317170 max num pixels=8
[2023-04-15 01:49:50.177] [CARTA] [debug] PV preview profile 12658330 of 25317170 max num pixels=9
[2023-04-15 01:49:50.254] [CARTA] [debug] PV preview profile 12658331 of 25317170 max num pixels=9
[2023-04-15 01:49:50.324] [CARTA] [debug] PV preview profile 12658332 of 25317170 max num pixels=10
[2023-04-15 01:49:50.329] [CARTA] [debug] PV preview profile 12658333 of 25317170 max num pixels=10
[2023-04-15 01:49:50.338] [CARTA] [debug] PV preview profile 12658334 of 25317170 max num pixels=10
[2023-04-15 01:49:50.345] [CARTA] [debug] PV preview profile 12658335 of 25317170 max num pixels=9
[2023-04-15 01:49:50.349] [CARTA] [debug] PV preview profile 12658336 of 25317170 max num pixels=9
[2023-04-15 01:49:50.357] [CARTA] [debug] PV preview profile 12658337 of 25317170 max num pixels=9
...
[2023-04-15 01:49:52.062] [CARTA] [debug] PV preview profile 12658843 of 25317170 max num pixels=8
[2023-04-15 01:49:52.063] [CARTA] [debug] PV preview profile 12658844 of 25317170 max num pixels=8
[2023-04-15 01:49:52.064] [CARTA] [debug] PV preview profile 12658845 of 25317170 max num pixels=8
[2023-04-15 01:49:52.066] [CARTA] [debug] PV preview profile 12658846 of 25317170 max num pixels=10
[2023-04-15 01:49:52.067] [CARTA] [debug] PV preview profile 12658847 of 25317170 max num pixels=8
Killed: 9
It hangs for ~23 seconds, then starts showing messages with large numbers. Then it hangs again for about 2 minutes and then crashes. When it crashed, I saw a temp folder in the directory where carta_backend was launched. This time its size is 59 GB.
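When scanning a long backend log for this symptom, a small script can flag the point where the profile total jumps. This is only a sketch: the regular expression is based on the debug lines quoted above, and `parseProfileLine`, `findTotalJumps`, and the default factor of 100 are hypothetical choices.

```typescript
interface ProfileProgress {
    index: number;
    total: number;
}

// Matches "... profile 517 of 525 ..." as seen in the debug lines above.
const PROFILE_RE = /profile (\d+) of (\d+)/;

// Extracts the profile index and total from one log line, or null if absent.
function parseProfileLine(line: string): ProfileProgress | null {
    const m = PROFILE_RE.exec(line);
    if (m === null) {
        return null;
    }
    return { index: Number(m[1]), total: Number(m[2]) };
}

// Reports every place where the total grows by more than `factor`
// relative to the previous profile line.
function findTotalJumps(lines: string[], factor: number = 100): Array<{ from: number; to: number }> {
    const jumps: Array<{ from: number; to: number }> = [];
    let previous: number | null = null;
    for (const line of lines) {
        const progress = parseProfileLine(line);
        if (progress === null) {
            continue;
        }
        if (previous !== null && progress.total > previous * factor) {
            jumps.push({ from: previous, to: progress.total });
        }
        previous = progress.total;
    }
    return jumps;
}

// Lines copied from the log above.
const sampleLog: string[] = [
    "[2023-04-15 01:48:25.225] [CARTA] [debug] PV preview profile 524 of 525 max num pixels=3",
    "[2023-04-15 01:48:25.228] [CARTA] [debug] Updating pv preview 0 for region 2",
    "[2023-04-15 01:49:50.118] [CARTA] [debug] PV preview profile 12658322 of 25317170 max num pixels=10",
];
const jumps = findTotalJumps(sampleLog);
console.log(jumps);
```

On the sample lines this reports a single jump from 525 to 25317170 profiles, which is the transition visible in the log.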
UPDATE3: I think I found a way. We likely need a large downsampled cube (close to 1 GB) and a PV cut that is not too short. Then, like a monkey 😆, drag the PV cut around at an unusually fast speed. You may drop the PV cut at some point and repeat the fast drag action to see if the backend starts hanging. When it hangs, check the backend log for messages like the example log above.
@pford In the directory where I launched the carta_backend, I see some large temp directories such as
444M TempLattice80076_0
(base) kswang@octopus build % du -h TempLattice80230_0
34G TempLattice80230_0
Not sure how I got those so far. Do you have any idea? Could it be that, because of the hanging issue mentioned above, the temp folder was not removed properly?
Got a crash, but I am not sure how to reproduce it so far.
The macOS crash report is attached.
VM Region Info: 0 is not in any region. Bytes before following region: 4531212288
REGION TYPE START - END [ VSIZE] PRT/MAX SHRMOD REGION DETAIL
UNUSED SPACE AT START
--->
__TEXT 10e14d000-10e555000 [ 4128K] r-x/r-x SM=COW ...carta_backend
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 carta_backend 0x10e212ded casacore::ArrayLattice<bool>::getAt(casacore::IPosition const&) const + 13 (ArrayLattice.tcc:168)
1 carta_backend 0x10e31073d carta::PvPreviewCube::GetRegionProfile(std::__1::shared_ptr<casacore::LCRegion>, casacore::ArrayLattice<bool> const&, std::__1::function<void (float)>, std::__1::vector<float, std::__1::allocator<float> >&, double&, bool&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&) + 749 (PvPreviewCube.cc:192)
2 carta_backend 0x10e3960f0 carta::RegionHandler::CalculatePvPreviewImage(int, int, bool, std::__1::shared_ptr<carta::PvPreviewCut>, std::__1::shared_ptr<carta::PvPreviewCube>, std::__1::function<void (float)>, CARTA::PvResponse&, carta::GeneratedImage&) + 2944 (RegionHandler.cc:1242)
3 carta_backend 0x10e39972b carta::RegionHandler::UpdatePvPreviewImage(int, int, bool, std::__1::function<void (CARTA::PvResponse&, carta::GeneratedImage&)>) + 1707 (RegionHandler.cc:1428)
4 carta_backend 0x10e3c9c8a carta::Session::SendPvPreview(int, int, bool) + 298 (Session.cc:1785)
5 carta_backend 0x10e3c965f carta::Session::OnSetRegion(CARTA::SetRegion const&, unsigned int, bool) + 671 (Session.cc:812)
6 carta_backend 0x10e3e727e carta::SessionManager::OnMessage(uWS::WebSocket<false, true, carta::PerSocketData>*, std::__1::basic_string_view<char, std::__1::char_traits<char> >, uWS::OpCode) + 1582 (SessionManager.cc:319)
7 carta_backend 0x10e3ef944 ofats::any_detail::any_invocable_impl<void, false, uWS::WebSocket<false, true, carta::PerSocketData>*, std::__1::basic_string_view<char, std::__1::char_traits<char> >, uWS::OpCode>::call(uWS::WebSocket<false, true, carta::PerSocketData>*, std::__1::basic_string_view<char, std::__1::char_traits<char> >, uWS::OpCode) + 17 (MoveOnlyFunction.h:247) [inlined]
8 carta_backend 0x10e3ef944 ofats::any_invocable<void (uWS::WebSocket<false, true, carta::PerSocketData>*, std::__1::basic_string_view<char, std::__1::char_traits<char> >, uWS::OpCode)>::operator()(uWS::WebSocket<false, true, carta::PerSocketData>*, std::__1::basic_string_view<char, std::__1::char_traits<char> >, uWS::OpCode) + 17 (MoveOnlyFunction.h:354) [inlined]
9 carta_backend 0x10e3ef944 uWS::WebSocketContext<false, true, carta::PerSocketData>::handleFragment(char*, unsigned long, unsigned int, int, bool, uWS::WebSocketState<true>*, void*) + 1300 (WebSocketContext.h:93)
10 carta_backend 0x10e3eeb86 bool uWS::WebSocketProtocol<true, uWS::WebSocketContext<false, true, carta::PerSocketData> >::consumeMessage<6u, unsigned char>(unsigned char, char*&, unsigned int&, uWS::WebSocketState<true>*, void*) + 758 (WebSocketProtocol.h:328)
11 carta_backend 0x10e3ee7e0 uWS::WebSocketProtocol<true, uWS::WebSocketContext<false, true, carta::PerSocketData> >::consume(char*, unsigned int, uWS::WebSocketState<true>*, void*) + 272 (WebSocketProtocol.h:424)
12 carta_backend 0x10e3ee67a auto uWS::WebSocketContext<false, true, carta::PerSocketData>::init()::'lambda'(auto*, char*, int)::operator()<us_socket_t>(auto*, char*, int) const + 138 (WebSocketContext.h:286)
13 carta_backend 0x10e4a37f5 us_loop_run + 133 (epoll_kqueue.c:147)
14 carta_backend 0x10e3eb1c7 uWS::Loop::run() + 8 (Loop.h:159) [inlined]
15 carta_backend 0x10e3eb1c7 uWS::run() + 15 (Loop.h:176) [inlined]
16 carta_backend 0x10e3eb1c7 uWS::TemplatedApp<false>::run() + 15 (App.h:393) [inlined]
17 carta_backend 0x10e3eb1c7 carta::SessionManager::RunApp() + 343 (SessionManager.cc:590)
18 carta_backend 0x10e31faa5 main + 3877 (Main.cc:189)
19 dyld 0x11cce652e start + 462
@confluence @markccchiang please proceed with the code review for the beta release. I will perform some other tests at the same time.
I tested this branch against the current ICD-RxJS to check for regressions,
and found that PV_GENERATOR_HDF5_COMPARED_FITS.test.ts
fails on this branch.
This means the PV images generated from FITS and HDF5 (the same image in different formats) are not consistent.
For example, I used HD163296_CO_2_1.fits and HD163296_CO_2_1.hdf5.
I generated a region:
setRegion: { fileId: 0, regionId: -1, regionInfo: { regionType: CARTA.RegionType.LINE, controlPoints: [{ x: 79, y: 77 }, { x: 362, y: 360 }], rotation: 135 } },
and a PV cut:
setPVRequest: { fileId:0, regionId:1, width:3, },
On the current dev branch, subtracting the two PV images (generated from FITS and HDF5) gives all zeros.
On this branch, however, the difference is NOT all zeros, indicating that the two generated PV images are not the same.
Because the code freeze is coming, I am not sure whether this should be fixed before the code freeze, or whether I can block the related test and fix the issue later.
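The subtraction check described above can be sketched as follows. This is an illustration, not the actual test code: `maxAbsDifference` is a hypothetical helper, and treating pixels that are NaN in both images as equal is an assumption about how blanked pixels should be handled.

```typescript
// Hypothetical helper: largest absolute pixel difference between two
// PV images stored as flat float32 arrays of the same length.
function maxAbsDifference(a: Float32Array, b: Float32Array): number {
    if (a.length !== b.length) {
        throw new Error(`Image sizes differ: ${a.length} vs ${b.length}`);
    }
    let maxDiff = 0;
    for (let i = 0; i < a.length; i++) {
        const x = Number(a[i]);
        const y = Number(b[i]);
        // Assumption: a pixel that is NaN (blanked) in both images counts as equal.
        if (Number.isNaN(x) && Number.isNaN(y)) {
            continue;
        }
        // A pixel blanked in only one image is reported as an infinite difference.
        if (Number.isNaN(x) || Number.isNaN(y)) {
            return Infinity;
        }
        maxDiff = Math.max(maxDiff, Math.abs(x - y));
    }
    return maxDiff;
}

// Toy data standing in for the FITS-based and HDF5-based PV images.
const pvFromFits = Float32Array.of(1, 2, NaN, 4);
const pvFromHdf5 = Float32Array.of(1, 2, NaN, 4);
console.log(maxAbsDifference(pvFromFits, pvFromHdf5) === 0); // true when the images match
```

Reporting the maximum difference rather than a plain pass/fail also shows whether a mismatch is at rounding level or a real discrepancy.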
@acdo2002 please file an issue in the backend repo. This can be addressed after the beta release.
@acdo2002 if you have the two images matched, do you see identical region spectral profiles?
@kswang1029 OK, I will file an issue in the backend repo and block the related sub-test in PV_GENERATOR_HDF5_COMPARED_FITS.test.ts.
Because the Z profile currently does not allow selecting a line as the region (with a rectangle as the region, the Z profiles are the same), I only checked the "SPATIAL_PROFILE_DATA" profile of the line; the two matched images are the same (checked by eye as well as via rawValuesFp32).
@confluence would you be able to fix the merge conflict in the changelog and protobuf for Pam?
Yes, it shouldn't be a problem.
Package | Line Rate | Health |
---|---|---|
src.Cache | 65% | ➖ |
src.DataStream | 52% | ➖ |
src.FileList | 67% | ➖ |
src.Frame | 50% | ➖ |
src.HttpServer | 43% | ➖ |
src.ImageData | 28% | ❌ |
src.ImageFitter | 89% | ✔ |
src.ImageGenerators | 44% | ➖ |
src.ImageStats | 76% | ✔ |
src.Logger | 44% | ➖ |
src.Main | 54% | ➖ |
src.Region | 18% | ❌ |
src.Session | 29% | ❌ |
src.Table | 52% | ➖ |
src.ThreadingManager | 87% | ✔ |
src.Timer | 85% | ✔ |
src.Util | 50% | ➖ |
Summary | 38% (6873 / 18165) | ❌ |
Implements fast PV image preview for #795.
How does this PR solve the issue? Give a brief summary.
Are there any companion PRs (frontend, protobuf)? protobuf #82 and frontend #2100
Is there anything else that testers should know (e.g. exactly how to reproduce the issue)? Launch the PV generator widget, set parameters, and click "Start Preview".