PIC-IRIS / PH5

Library of PH5 clients, apis, and utilities

[FEATURE] Disable pforma for RT130 and Geode/Stratavisor data #530

Open hrotman-pic opened 4 months ago

hrotman-pic commented 4 months ago

Is your feature request related to a problem? Please describe.
Loading RT130 data or Geode/Stratavisor (multichannel) data with pforma is highly likely to result in an invalid archive.

Describe the solution you'd like
Only allow ingestion of these data via the command line. These formats are rarely used in PH5, and the datasets are very likely to be relatively small.

Describe alternatives you've considered
The RT130 problem is related to the response table. The multichannel problem is that data from the same DAS S/N do not all end up in the same mini file. It would be possible to open issues requesting the relevant modifications, but given the low number of datasets likely to be affected, I think removing the ability to ingest these formats with pforma is the most efficient and viable solution.

Additional context
Ingestion error messages for RT130 data when using pforma are somewhat different from the error messages when using the command line. It may be useful to permit a 'debug only' mode for RT130 data, or to have pforma and the terminal output state that the PH5 should not be used and that the 1302ph5 log is for debugging purposes only. But again, the number of relevant datasets is likely very small, and it may not be worth trying to do this.

hrotman-pic commented 3 weeks ago

Per today's meeting I will provide test cases and a comparison. Any test case for RT130s requires data from multiple RT130s.

A suggested solution is: if pforma encounters a filename in the input list that matches expectations for the RT130 format (.cf, .ref, .zip, or .tar extension) or the SEG2 format (.dat extension), then do not add any files, and print to the lower left corner of pforma and to stdout: "RT130 or SEG2 data detected, exit and add data to PH5 with 130toph5 or seg2toph5." The filename extensions above should be matched case-insensitively. I am not sure which pforma code in PH5 is the right place for these changes: pformagui, pforma_io, or something else.
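To make the suggestion concrete, here is a minimal sketch of the extension check in Python. It is not taken from the PH5 code base: the names `find_blocked_files` and `check_input_list` are hypothetical, and where such a check would hook into pforma is still an open question.

```python
import os

# Extensions from the suggestion above; compared in lower case so that
# matching is case-insensitive.
RT130_EXTENSIONS = {".cf", ".ref", ".zip", ".tar"}
SEG2_EXTENSIONS = {".dat"}

MESSAGE = ("RT130 or SEG2 data detected, exit and add data to PH5 "
           "with 130toph5 or seg2toph5.")


def find_blocked_files(filenames):
    """Return the entries whose extension looks like RT130 or SEG2 data."""
    blocked = []
    for name in filenames:
        # splitext only sees the last suffix, so "x.tar.gz" yields ".gz"
        # and is NOT matched by this sketch.
        ext = os.path.splitext(name.strip())[1].lower()
        if ext in RT130_EXTENSIONS or ext in SEG2_EXTENSIONS:
            blocked.append(name)
    return blocked


def check_input_list(filenames):
    """Hypothetical guard: return (ok, message) for a pforma input list."""
    if find_blocked_files(filenames):
        return False, MESSAGE
    return True, ""


if __name__ == "__main__":
    ok, message = check_input_list(["1.fcnt", "15001.DAT", "run1.REF"])
    if not ok:
        print(message)
```

A single matching file rejects the whole list in this sketch, which follows the "do not add files" wording above; whether pforma should instead skip only the matching entries is a separate design decision.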

hrotman-pic commented 3 weeks ago

I have attached the file with the testing I previously did for RT130 data. Each test lists the response_table_n_i found in the DAS table for each DAS and, on the bottom row, the response_table_n_i found in the actual Response_t. I suggest looking at the SHIRE tests. The test PH5s on the server are at holly_test/rt130_ingestion/SHIRE; PH5_Archive_pforma_v2 and PH5_Archive_cmd have the greatest difference in the Response_t. I think what is happening is that the Response_t populated in Sigma comes from processing directory A only, not from all processing directories. The SHIRE tests, and the other tests in the attached file, were performed to compare response_table_n_i only, because that is the bug, and nothing else has been completed on them. Trying to proceed with the PH5, specifically at resp_load, results in the Response_t being deleted from the PH5 and in "DAS table not found" warnings for DAS S/Ns that have a response_table_n_i that is not in the Response_t. If you require a PH5 that has been through those steps, I likely have one or two still on the server.
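The comparison in the spreadsheet amounts to the check sketched below in Python. It works on plain dictionaries and lists rather than reading a PH5 file, so extracting the values from the DAS tables and Response_t is left out. The function name `find_orphaned_responses` and the sample numbers are hypothetical and are not results from the SHIRE tests.

```python
def find_orphaned_responses(das_n_i, response_n_i):
    """Report response_table_n_i values that a DAS table references
    but that are absent from Response_t.

    das_n_i: dict mapping DAS serial number -> iterable of
             response_table_n_i values found in that DAS table.
    response_n_i: iterable of response_table_n_i values in Response_t.
    Returns a dict mapping DAS serial number -> sorted list of missing values.
    """
    available = set(response_n_i)
    orphaned = {}
    for das, indices in das_n_i.items():
        missing = sorted(set(indices) - available)
        if missing:
            orphaned[das] = missing
    return orphaned


if __name__ == "__main__":
    # Made-up values for illustration only.
    das_tables = {"9A3F": [0, 1, 2], "9B12": [3, 4, 5]}
    response_t = [0, 1, 2]
    print(find_orphaned_responses(das_tables, response_t))
```

An empty result would mean every response_table_n_i referenced by a DAS table is present in Response_t, which is the condition the pforma-built archive appears to violate.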

If there is a solution for the RT130 part of this issue that involves RT130 data being added with pforma, we should consider allowing .cf card directories, because that is how RT130 data are offloaded now. And if "DAS table not found" occurs at resp_load and the DAS in question is an RT130, that should prompt an error message instead of a warning message, to discourage users from proceeding.

I do not have a comparison file for the Geode/Stratavisor SEG2 data, but there is a test at holly_test/Holly_seg2_test/post_update_ph52/Sigma. If you examine each miniPH5 you will find data from one DAS (0000SV01, 0000SV02, etc.) in more than one mini file. Data from multiple DAS are in the same raw file: the file 15001.dat, for example, has DAS 0000SV01, DAS 0000SV02, DAS 0000SV03, and so on. So in this test PH5 it seems like all the data from 15001.dat goes in miniPH5_00001.ph5, all the data from 15002.dat goes in miniPH5_00002.ph5, and so on.
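The condition described above can be checked mechanically once the DAS serial numbers in each mini file are known. The Python sketch below assumes that listing has already been produced by some other means; the function name `das_split_across_minis` is hypothetical, and the sample input only mirrors the pattern described above, not the full contents of the test archive.

```python
from collections import defaultdict


def das_split_across_minis(mini_contents):
    """Find DAS serial numbers that appear in more than one mini file.

    mini_contents: dict mapping mini file name -> iterable of DAS serial
                   numbers that have data in that file.
    Returns a dict mapping DAS serial number -> sorted list of mini files,
    restricted to DAS found in two or more files.
    """
    minis_by_das = defaultdict(set)
    for mini, das_list in mini_contents.items():
        for das in das_list:
            minis_by_das[das].add(mini)
    return {
        das: sorted(minis)
        for das, minis in minis_by_das.items()
        if len(minis) > 1
    }


if __name__ == "__main__":
    # Illustrative input following the pattern described in this comment.
    contents = {
        "miniPH5_00001.ph5": ["0000SV01", "0000SV02", "0000SV03"],
        "miniPH5_00002.ph5": ["0000SV01", "0000SV02", "0000SV03"],
    }
    for das, minis in sorted(das_split_across_minis(contents).items()):
        print(das, minis)
```

For a valid archive the returned dictionary would be empty, since all data from one DAS S/N are expected to be in the same mini file.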

PH5_RT130_ingestion_testing.xlsx

timronan commented 1 week ago

Please output this message: "RT130 or SEG2 data detected, exit and add data to PH5 with 130toph5 or seg2toph5."