bryanhanson / readJDX

Import spectroscopic data in the JCAMP-DX format
https://bryanhanson.github.io/readJDX/

First value on a line should equal to last value on the prev. line only if DIF used #6

Closed djacob65 closed 5 years ago

djacob65 commented 5 years ago

Hi,

I found an issue that occurs when the first value on a line is not equal to the last value on the previous line AND the last encoded method on the previous line is "SQZ". In this case, no error should be generated. However, the readJDX package does raise one.

Indeed, as the specification (http://old.iupac.org/jcamp/protocols/dxir01.pdf) states:

(5.8.2) Y-VALUE CHECK. In DIF mode, a Y-value check is performed at the beginning of each line to guard against a single error which would invalidate all following values. When, and only when, the last ordinate of a line is in DIF form, the abscissa does not advance before the next line. The first ordinate of the next line (after the leading abscissa value) is always an actual value, equal to the last calculated ordinate of the previous line.
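The rule quoted above can be sketched in R as follows. This is only an illustration of the rule, not readJDX's actual code: the function `y_value_check` and its arguments are hypothetical, and it assumes each line has already been decoded into a vector of modes and a vector of ordinates. The sample values are taken from the example in this report.

```r
# Hypothetical helper (not part of readJDX): apply the Y value check between
# two consecutive lines only when the previous line ends in DIF form.
# prev_modes: character vector of modes on the previous line, e.g. c("SQZ", "DIF")
# prev_y    : numeric vector of decoded ordinates on the previous line
# next_y    : numeric vector of decoded ordinates on the next line
y_value_check <- function(prev_modes, prev_y, next_y) {
  last_mode <- prev_modes[length(prev_modes)]
  if (last_mode != "DIF") {
    # Previous line did not end in DIF form: no check, nothing to drop
    return(list(checked = FALSE, ok = NA, next_y = next_y))
  }
  ok <- isTRUE(all.equal(prev_y[length(prev_y)], next_y[1]))
  # When the check passes, the first ordinate is a duplicate and is dropped
  list(checked = TRUE, ok = ok, next_y = if (ok) next_y[-1] else next_y)
}

# Previous line ends in DIF: the check runs and the duplicate is removed
res_dif <- y_value_check(c("SQZ", "DIF"), c(-39762, -29075), c(-29075, -10817))

# Previous line ends in SQZ: the check is skipped, differing values are fine
res_sqz <- y_value_check(c("DIF", "SQZ"), c(5474, -3444), c(-6217, -27304))
```

Returning a `checked` flag separately from `ok` keeps "check skipped" distinct from "check failed", which is the distinction this report is about.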

Here is an example (obtained with my own R script):

Line1: [,1] [,2] [,3] [,4] [,5] ... [,14] [,15]
M "SQZ" "DIF" "DIF" "DIF" "DIF" ... "DIF" "DIF"
C "c4644" "J813" "n044" "L757" "k598" ... "j537" "J0687"
X "32767" "32766" "32765" "32764" "32763" ... "32754" "32753"
Y "-34644" "-32831" "-37875" "-34118" "-36716" ... "-39762" "-29075"

Line2: [,1] [,2] [,3] [,4] [,5] ... [,13] [,14]
M "SQZ" "SQZ" "DIF" "DIF" "DIF" ... "DIF" "DIF"
C "b9075" "a0817" "k571" "J683" "m474" ... "m357" "n362"
X "32753" "32752" "32751" "32750" "32749" ... "32741" "32740"
Y "-29075" "-10817" "-13388" "-11705" "-16179" ... "-20486" "-25848"

Line3: [,1] [,2] [,3] [,4] [,5] ... [,13] [,14]
M "SQZ" "DIF" "DIF" "SQZ" "DIF" ... "DIF" "DIF"
C "b5848" "N433" "P636" "A1265" "J3455" ... "K022" "j8793"
X "32740" "32739" "32738" "32737" "32736" ... "32728" "32727"
Y "-25848" "-20415" "-12779" " 11265" " 24720" ... "-21318" "-40111"

Line4: [,1] [,2] [,3] [,4] [,5] ... [,14] [,15]
M "SQZ" "DIF" "SQZ" "DIF" "DIF" ... "DIF" "SQZ"
C "d0111" "o585" "a5769" "q418" "P55" ... "k00" "c444"
X "32727" "32726" "32725" "32724" "32723" ... "32714" "32713"
Y "-40111" "-46696" "-15769" "-24187" "-23432" ... " 5474" " -3444"

Line5: [,1] [,2] [,3] [,4] [,5] ... [,14] [,15]
M "SQZ" "DIF" "DIF" "DIF" "SQZ" ... "DIF" "DIF"
C "f217" "k1087" "N322" "n87" "F542" ... "J3047" "J2087"
X "32712" "32711" "32710" "32709" "32708" ... "32699" "32698"
Y " -6217" "-27304" "-21982" "-22569" " 6542" ... " 14157" " 26244"

When the last encoded method on a line is "DIF", the first value on the next line equals the last value on that line. For line 4, the last encoded method is "SQZ", so the first value on line 5 is NOT equal to the last value on line 4, and this is the expected result.

With the readJDX package, and with the same example, we get:
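For readers who want to reproduce the decoded Y values in the tables above, here is a minimal R sketch of SQZ/DIF decoding. It is not the script used for this report and not readJDX's implementation: `decode_asdf_line` is a hypothetical name, it assumes the Y tokens of a line have already been split, and it deliberately ignores DUP and plain AFFN numbers.

```r
# Leading characters that carry the first digit and the sign of a token
sqz_chars <- c("@", LETTERS[1:9], letters[1:9])      # @, A-I (+1..+9), a-i (-1..-9)
dif_chars <- c("%", LETTERS[10:18], letters[10:18])  # %, J-R (+1..+9), j-r (-1..-9)
lead_digits <- c(0, 1:9, -(1:9))                     # same layout for both alphabets

# Hypothetical decoder: tokens is a character vector of SQZ/DIF tokens from one line
decode_asdf_line <- function(tokens) {
  y <- numeric(length(tokens))
  modes <- character(length(tokens))
  for (i in seq_along(tokens)) {
    lead <- substr(tokens[i], 1, 1)
    rest <- substr(tokens[i], 2, nchar(tokens[i]))
    if (lead %in% sqz_chars) {
      modes[i] <- "SQZ"
      digit <- lead_digits[match(lead, sqz_chars)]
    } else if (lead %in% dif_chars) {
      modes[i] <- "DIF"
      digit <- lead_digits[match(lead, dif_chars)]
    } else {
      stop("unsupported token: ", tokens[i])
    }
    sgn <- if (digit < 0) -1 else 1
    value <- sgn * as.numeric(paste0(abs(digit), rest))
    if (modes[i] == "SQZ") {
      y[i] <- value                 # SQZ: an actual ordinate
    } else {
      if (i == 1) stop("a line cannot start with a DIF token")
      y[i] <- y[i - 1] + value      # DIF: a difference from the previous ordinate
    }
  }
  data.frame(C = tokens, M = modes, Y = y, stringsAsFactors = FALSE)
}

# First five tokens of Line1 in the report
line1_start <- decode_asdf_line(c("c4644", "J813", "n044", "L757", "k598"))
# Last token of Line4 and first token of Line5: both SQZ, so both are actual values
line4_last <- decode_asdf_line("c444")$Y
line5_first <- decode_asdf_line("f217")$Y
```

Because `c444` and `f217` are both SQZ tokens, they decode to independent actual values, so nothing requires them to match.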

Processing JDX

Processing real data...

Y value check failed; nearby values:
  Line FirstYonLine LastYonPrevLine Problem
2 2060       -29075          -29075
3 2061       -25848          -25848
4 2062       -40111          -40111
5 2063        -6217           -3444       *

Cheers Daniel

bryanhanson commented 5 years ago

Thanks, Daniel, for your careful report. I'm actually in the process of updating the package, so I'll investigate your report and see if I can make readJDX behave. It will probably be a week or two, however. Testing is much easier if I have the file that causes the issue. If at all possible, please e-mail it to me at hanson@depauw.edu. I won't share your file with anyone.

bryanhanson commented 5 years ago

Hi Daniel... Thank you so much for reporting the issue and sharing the files. Your report led to the fix of some long-standing issues. I had a major misunderstanding about when the Y value check should be carried out. In the process of rooting this out, I discovered several other issues, which were also fixed. Things are much more robust now.

I'll be e-mailing you the file I extracted from the file you shared. It contains the real data. The devel version of readJDX reads it successfully. I'd appreciate it if you could test as well. You'll need the devel version:

install.packages("devtools")
library("devtools")
install_github(repo = "bryanhanson/readJDX@devel")
library("readJDX")

This version tests successfully against 71 files that I have, from a wide range of instruments and vendors. It fails on a few other files, but I have a pretty good idea of what might be going on. I'll be working on it in the next few days, longer if necessary.

Thanks, Bryan

djacob65 commented 5 years ago

Hi Bryan, I've tested with your devel version, and indeed, it works! We obtain the right values now.

Thanks Daniel

bryanhanson commented 5 years ago

Many thanks!

bryanhanson commented 4 years ago

For completeness, it turns out that a sequence of ASDF codes like ... DIF DUP DUP (end) must also be treated as being in DIF mode, so the Y value check is carried out and the extra Y value is removed. I discovered this while testing an extensively rewritten readJDX, to be released soon.
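One way to express that rule, as a rough sketch only: look at the last mode on the line that is not DUP, and treat the line as ending in DIF mode if that mode is DIF. The function name `ends_in_dif_mode` is hypothetical and this is not necessarily how the rewritten readJDX does it; it assumes the modes of a line are available as a character vector.

```r
# Hypothetical helper: TRUE if the line ends in DIF mode once trailing DUPs
# are looked through, i.e. the last non-DUP mode on the line is "DIF".
ends_in_dif_mode <- function(modes) {
  non_dup <- modes[modes != "DUP"]
  length(non_dup) > 0 && non_dup[length(non_dup)] == "DIF"
}

ends_dif_dup <- ends_in_dif_mode(c("SQZ", "DIF", "DUP", "DUP"))  # DIF repeated by DUPs
ends_sqz_dup <- ends_in_dif_mode(c("SQZ", "DIF", "SQZ", "DUP"))  # SQZ repeated by DUP
```

With this helper, the decision "run the Y value check and drop the extra value" depends on `ends_in_dif_mode(modes)` rather than on the literal last mode on the line.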