kbss-cvut / aircraft-maintenance-planning-system

D2020+ project about Aircraft Maintenance Planning System.
GNU Lesser General Public License v3.0

wo-tc-ref is not working well because quote character is used #107

Closed blcham closed 1 year ago

blcham commented 1 year ago

The goal is to find out if we are getting valid CSV as input or not (see https://www.rfc-editor.org/rfc/rfc4180). Based on that we should fix/change the implementation of the pipeline.

I saw a line where the value of a cell starts with a quote (see "PAINT):

OH-LZM  5.657534        OH-LZM/H-22 HMV1        5880114 M       O       05-40           ZL-522-01-1     24.10.2022              5       "PAINT    RSTOR EXTERNAL PAINT AS PER REF/A/ PARA 2.A.(17)"

On the other hand, we have a " (quote) used inside a value:

M-ABOI  12.347945       M-ABOI/H-22 12Y 5813318 M       C       53-80           53-800-00       01.06.2022      24.09.2022      1       SHM:  DURING GVI INSPECTION WAS FOUND BURN MARK ON FUSELAGE SKIN IN FASTENER HEAD  POSITION: STA 188-196,5  /  10" INCH UNDER  STR - 24R /      NDT: HFEC INSPECTION OF THE SURFACE AND HFEC INSPECTION OF OPEN HOLE HAS BEEN PERFORMED ACC. TO NDTM 51-00-00 PROCEDURE 4 AND PROCEDURE 16, PART 6, RE

By the way, I checked how it is implemented in the CSV processor, and I think it does not support quotes inside text.

Matthew-Kulich commented 1 year ago

From the picture below I understand that it is allowed to have a delimiter in the value.

The first example (the value starting with "PAINT), which is wrapped in double quotes, is valid. The second one is incorrect: the value should be wrapped in double quotes, and the quote inside should be escaped by another one.
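The escaping rule described above can be sketched in a few lines of Python. This is only an illustration of the RFC 4180 convention, not code from the pipeline; the helper name `quote_rfc4180` is hypothetical, and the sample value is shortened from the second row in this issue.

```python
import csv


def quote_rfc4180(field: str) -> str:
    """Wrap a field in double quotes and double every embedded double quote."""
    return '"' + field.replace('"', '""') + '"'


# Shortened from the M-ABOI row quoted in this issue.
raw = 'POSITION: STA 188-196,5  /  10" INCH UNDER  STR - 24R'
escaped = quote_rfc4180(raw)
print(escaped)

# A CSV reader that follows the same convention recovers the original text,
# including the comma and the inch mark.
assert next(csv.reader([escaped])) == [raw]
```

Under this convention the inch mark would have to appear as `10"" INCH` inside a quoted field, which is not what the export contains.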

(two images attached to the original comment)
blcham commented 1 year ago

Yes, but it says: "if double-quotes are used to enclose fields, then a double-quote appearing inside a field must be escaped by preceding it with another double quote."

So to me, it seems that their export was valid. See also: https://stackoverflow.com/questions/28669201/avoid-double-quote-confusion-when-generating-csv-files

Matthew-Kulich commented 1 year ago

I see, but if you look at the rule non-escaped = *TEXTDATA, the TEXTDATA range does not include the double quote.
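To make the grammar argument concrete, here is a small sketch that checks a value against the TEXTDATA ranges from the RFC 4180 ABNF (%x20-21 / %x23-2B / %x2D-7E). The function names are hypothetical and the check is illustrative only; it says nothing about how the pipeline validates input.

```python
def is_textdata(ch: str) -> bool:
    """True if ch falls in the RFC 4180 TEXTDATA ranges.

    The ranges skip 0x22 (double quote) and 0x2C (comma).
    """
    code = ord(ch)
    return 0x20 <= code <= 0x21 or 0x23 <= code <= 0x2B or 0x2D <= code <= 0x7E


def is_valid_non_escaped(field: str) -> bool:
    """True if the field could appear unquoted in strict RFC 4180 CSV."""
    return all(is_textdata(ch) for ch in field)


print(is_valid_non_escaped("RSTOR EXTERNAL PAINT"))  # no quote, no comma
print(is_valid_non_escaped('10" INCH UNDER'))        # contains a double quote
```

By this reading, an unquoted field containing `"` is outside the strict CSV grammar, which is the point made in this comment.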

blcham commented 1 year ago

Yes, but that relates to the CSV standard, and they send us TSV :( Here is another document: https://www.iana.org/assignments/media-types/text/tab-separated-values.
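The IANA TSV description defines no quoting mechanism, so one way to read such input is to treat the quote as an ordinary character. The sketch below uses Python's standard `csv` module as a stand-in to show the difference; it is not the SuperCSV configuration used in this project, the helper `parse_tsv_line` is hypothetical, and the rows are shortened from the examples in this issue.

```python
import csv


def parse_tsv_line(line: str, honor_quotes: bool) -> list:
    """Split one tab-separated line, with or without CSV-style quote handling."""
    quoting = csv.QUOTE_MINIMAL if honor_quotes else csv.QUOTE_NONE
    reader = csv.reader([line], delimiter="\t", quoting=quoting)
    return next(reader)


# Shortened versions of the two rows quoted in this issue.
starts_with_quote = 'OH-LZM\t5\t"PAINT RSTOR (17)"'
quote_inside = 'M-ABOI\t1\t10" INCH UNDER'

# With quoting disabled, every quote character is kept as plain data.
print(parse_tsv_line(starts_with_quote, honor_quotes=False))
print(parse_tsv_line(quote_inside, honor_quotes=False))

# With CSV-style quoting, the enclosing quotes of the first row are stripped.
print(parse_tsv_line(starts_with_quote, honor_quotes=True))
```

Disabling quote handling keeps both rows intact as exported, at the cost of not supporting tabs inside values, which the TSV description does not allow anyway.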

blcham commented 1 year ago

@Matthew-Kulich I found it!!! The solution for SuperCSV is described here (the whole post is quite funny to read): https://stackoverflow.com/a/15213005/6812609

blcham commented 1 year ago

Also, please upgrade to the newest version of SuperCSV (2.4.0); it has better error messages, e.g.: https://github.com/super-csv/super-csv/issues/33

blcham commented 1 year ago

@Matthew-Kulich Should we close this issue?

Matthew-Kulich commented 1 year ago

Yes, we can.