craigsapp / humextra

C++ programs and library for processing Humdrum data files. Best to install from https://github.com/humdrum-tools/humdrum-tools . See https://github.com/craigsapp/humlib for modernized Humdrum file parsing library.
http://extras.humdrum.org

Humdrum-Tools regression test failures #2

Closed MarcPerlman closed 9 years ago

MarcPerlman commented 9 years ago

I've been trying to install humdrum-tools on Mac OS 10.9.5.

The tools seem to install OK:

local $ which keycor
/usr/local/humdrum-tools/humextra/bin/keycor
local $

However, there are two (accent & infot) that fail the regression test:

[image not preserved]

[ ... ]

[image not preserved]

Are these known problems?

Thanks, Marc Perlman

craigsapp commented 9 years ago

Now they are known :-). The regression tests should now all work properly on your computer. To update the software, it is probably best to erase the humdrum-tools directory:

rm -rf humdrum-tools

and reinstall with the command:

git clone --recursive https://github.com/humdrum-tools/humdrum-tools

and then install as before. Normally, typing make update in the humdrum-tools directory is sufficient to download a new version of the code, but I fixed a problem where git thought that the "humdrum.1" man page should be spelled "Humdrum.1", which causes problems when trying to update in the normal manner...

1. Problem with accent regression test

This was a tricky problem to track down, and it was an error occurring on your computer (the test was correct). The problem was due to the backslash characters in \' (and \") in regular expressions related to assigning accent weightings to articulations. After the backslashes were removed, the test seems to be successful. The \' was causing all notes to be marked as accented (with ART:0.95 in the sample output below). The notes were not accented and should have ART:0.5 in the output.

Here is the input:

**kern
*clefG2
*M4/4
*k[]
*C:
=1-
4c
4d
4e
4f
=2
2g
2a
=3
2b
2cc
==
*-

Here is the correct output (as before):

**accent
*clefG2
*M4/4
*k[]
*C:
*
=1-
0.763636 AGO:1.0000 MEL:1 MET:1 DEG:1 ART:0.5 VOC:1 PHR:
0.621433 AGO:1.0000 MEL:0.33 MET:0.333333 DEG:0.548031 ART:0.5 VOC:1 PHR:
0.638972 AGO:1.0000 MEL:0.2211 MET:0.5 DEG:0.689764 ART:0.5 VOC:1 PHR:
0.620499 AGO:1.0000 MEL:0.2211 MET:0.333333 DEG:0.644094 ART:0.5 VOC:1 PHR:
=2
0.693703 AGO:1 MEL:0.2211 MET:1 DEG:0.817323 ART:0.5 VOC:1 PHR:
0.630726 AGO:1 MEL:0.2211 MET:0.5 DEG:0.576378 ART:0.5 VOC:1 PHR:
=3
0.667247 AGO:1 MEL:0.2211 MET:1 DEG:0.453543 ART:0.5 VOC:1 PHR:
0.694182 AGO:1 MEL:0.67 MET:0.5 DEG:1 ART:0.5 VOC:1 PHR:
==
*-

Here is the incorrect output caused by \' in a regular expression, which caused all notes to be considered accented (ART:0.95), apparently only in certain versions of awk and/or bash:

**accent
*clefG2
*M4/4
*k[]
*C:
*
=1-
0.878182 AGO:1.0000 MEL:1 MET:1 DEG:1 ART:0.95 VOC:1 PHR:
0.735978 AGO:1.0000 MEL:0.33 MET:0.333333 DEG:0.548031 ART:0.95 VOC:1 PHR:
0.753517 AGO:1.0000 MEL:0.2211 MET:0.5 DEG:0.689764 ART:0.95 VOC:1 PHR:
0.735044 AGO:1.0000 MEL:0.2211 MET:0.333333 DEG:0.644094 ART:0.95 VOC:1 PHR:
=2
0.808249 AGO:1 MEL:0.2211 MET:1 DEG:0.817323 ART:0.95 VOC:1 PHR:
0.745271 AGO:1 MEL:0.2211 MET:0.5 DEG:0.576378 ART:0.95 VOC:1 PHR:
=3
0.781792 AGO:1 MEL:0.2211 MET:1 DEG:0.453543 ART:0.95 VOC:1 PHR:
0.808727 AGO:1 MEL:0.67 MET:0.5 DEG:1 ART:0.95 VOC:1 PHR:
==
*-

On my OS X 10.9.5 laptop, I was getting the correct results for some reason (maybe you are not using the default bash shell or awk version that I was using). But in any case it should now be fixed in both computer configurations since quotes should not be backslash escaped in regular expressions.
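One plausible reading of the accent failure, offered as an assumption rather than something confirmed in this thread: GNU awk treats \' in a regular expression as an operator that matches the empty string at the end of the string, so a pattern containing it can match every token. The Python sketch below emulates that behaviour with \Z, Python's end-of-string anchor. The function name and the token list are invented for illustration; only the 0.5 and 0.95 weights come from the output above.

```python
import re

def art_weight(token: str, pattern: str) -> float:
    """Return 0.95 if the articulation pattern matches the token, else 0.5.

    Hypothetical helper: it only mimics the ART values seen in the
    sample output and is not the accent tool's actual code.
    """
    return 0.95 if re.search(pattern, token) else 0.5

# Only the last token contains an apostrophe.
tokens = ["4c", "4d", "2g", "4e'"]

intended = r"'"   # literal apostrophe, as after the fix
emulated = r"\Z"  # assumption: stands in for \' read as an end-of-string anchor

intended_weights = [art_weight(t, intended) for t in tokens]
emulated_weights = [art_weight(t, emulated) for t in tokens]

print(intended_weights)
print(emulated_weights)
```

With the literal apostrophe, only the token that contains one receives 0.95. With the end-of-string anchor, every token matches and receives 0.95, which is the same pattern as the incorrect output.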

2. Problem with ekern regression test

Overall this test was successful (as you probably noticed, since you did not mention it), but it printed additional warning messages that do not appear on my OS X 10.9.5 computer. I have probably suppressed the warnings by changing regular expressions \] into [[]. Here is the version of awk I am using in OS X 10.9.5, which might be different from yours:

$ awk --version
awk version 20070501

In Fedora 20 Linux I was getting the same warning messages as you were, and the respelling of the regular expressions fixed the problem on that platform.

3. Problem with infot regression test

The problem with the infot regression test was due to the input data for the test: the output data was a set of sorted lines, but all sort values were the same. With every sort key tied, the order of the lines was arbitrary and probably differed between computers (I ran the test on OS X 10.9.5 and got the expected order, probably since I created the test on my computer :-). A new test input file replaces the old one:

**data
A
B
C
B
A
A
A
C
B
C
B
A
A
A
*-

The expected output is

A   1.000
B   1.807
C   2.222
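The expected values are consistent with each token's information content in bits, -log2(count / total): A occurs 7 times in the 14 data records, B 4 times, and C 3 times. The Python sketch below reproduces the three numbers under that assumption. It is not the infot implementation, and the function name is invented for illustration.

```python
import math
from collections import Counter

def information_bits(records):
    """Map each distinct record to -log2(relative frequency)."""
    counts = Counter(records)
    total = sum(counts.values())
    return {tok: -math.log2(n / total) for tok, n in counts.items()}

# The 14 data records of the new test input, in order.
data = list("ABCBAAACBCBAAA")

bits = information_bits(data)

# Sort by value, then by token, so that ties could not change the order.
lines = [f"{tok} {val:.3f}" for tok, val in
         sorted(bits.items(), key=lambda kv: (kv[1], kv[0]))]

for line in lines:
    print(line)
```

Because the three values are now distinct, the sorted order no longer depends on how a particular sort implementation handles ties.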

The regression tests are all working for me now in OS X 10.9.5 and Fedora ~20 Linux:

TEST 01 for humsed: OK
TEST 01 for tonh: OK
TEST 01 for context: OK
TEST 01 for dur: OK
TEST 01 for assemble: OK
TEST 01 for solfa: OK
TEST 01 for recode: OK
TEST 01 for cbr: OK
TEST 01 for semits: OK
TEST 01 for proof: OK
TEST 01 for xdelta: OK
TEST 01 for trans: OK
TEST 01 for nf: OK
TEST 01 for infot: OK
TEST 01 for melac: OK
TEST 01 for yank: OK
TEST 01 for cocho: OK
TEST 01 for reihe: OK
TEST 01 for specc: OK
TEST 01 for mint: OK
TEST 01 for freq: OK
TEST 01 for patt: OK
TEST 01 for solfg: OK
TEST 01 for pattern: OK
TEST 01 for barks: OK
TEST 01 for pitch: OK
TEST 01 for text: OK
TEST 01 for hint: OK
TEST 01 for strophe: OK
TEST 01 for cents: OK
TEST 01 for urrhythm: OK
TEST 01 for rend: OK
TEST 01 for deg: OK
TEST 01 for vox: OK
TEST 01 for fields: OK
TEST 01 for degree: OK
TEST 01 for accent: OK
TEST 01 for num: OK
TEST 01 for pcset: OK
TEST 01 for rid: OK
TEST 02 for rid: OK
TEST 03 for rid: OK
TEST 04 for rid: OK
TEST 05 for rid: OK
TEST 06 for rid: OK
TEST 07 for rid: OK
TEST 08 for rid: OK
TEST 09 for rid: OK
TEST 10 for rid: OK
TEST 11 for rid: OK
TEST 12 for rid: OK
TEST 13 for rid: OK
TEST 14 for rid: OK
TEST 15 for rid: OK
TEST 16 for rid: OK
TEST 17 for rid: OK
TEST 18 for rid: OK
TEST 01 for thru: OK
TEST 01 for key: OK
TEST 01 for ydelta: OK
TEST 01 for correl: OK
TEST 01 for cleave: OK
TEST 01 for metpos: OK
TEST 01 for timebase: OK
TEST 01 for pf: OK
TEST 01 for pc: OK
TEST 01 for ditto: OK
TEST 01 for humdrum: OK
TEST 02 for humdrum: OK
TEST 03 for humdrum: OK
TEST 04 for humdrum: OK
TEST 05 for humdrum: OK
TEST 06 for humdrum: OK
TEST 07 for humdrum: OK
TEST 08 for humdrum: OK
TEST 09 for humdrum: OK
TEST 10 for humdrum: OK
TEST 11 for humdrum: OK
TEST 12 for humdrum: OK
TEST 13 for humdrum: OK
TEST 14 for humdrum: OK
TEST 15 for humdrum: OK
TEST 16 for humdrum: OK
TEST 17 for humdrum: OK
TEST 18 for humdrum: OK
TEST 19 for humdrum: OK
TEST 20 for humdrum: OK
TEST 21 for humdrum: OK
TEST 22 for humdrum: OK
TEST 23 for humdrum: OK
TEST 24 for humdrum: OK
TEST 25 for humdrum: OK
TEST 26 for humdrum: OK
TEST 27 for humdrum: OK
TEST 30 for humdrum: OK
TEST 31 for humdrum: OK
TEST 32 for humdrum: OK
TEST 33 for humdrum: OK
TEST 35 for humdrum: OK
TEST 36 for humdrum: OK
TEST 37 for humdrum: OK
TEST 38 for humdrum: OK
TEST 39 for humdrum: OK
TEST 01 for synco: OK
TEST 01 for scramble: OK
TEST 01 for iv: OK
TEST 01 for veritas: OK
TEST 01 for kern: OK
TEST 01 for ekern: OK
TEST 01 for extract: OK
TEST 01 for census: OK

craigsapp commented 9 years ago

Technically this was an issue for https://github.com/humdrum-tools/humdrum, but I don't get an email when issues occur there, so luckily you sent it here and I noticed it.

MarcPerlman commented 9 years ago

Thanks!

I don't have the technical chops to swap shells, so as far as I know I'm using the default everything. Here are the versions of awk and gcc I have:

kern $ awk --version
GNU Awk 4.1.1, API: 1.1
Copyright (C) 1989, 1991-2014 Free Software Foundation.
[ ... ]
kern $ gcc --version
Configured with: --prefix=/Library/Developer/CommandLineTools/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 6.0 (clang-600.0.56) (based on LLVM 3.5svn)
Target: x86_64-apple-darwin13.4.0
Thread model: posix
kern $ g++ --version
Configured with: --prefix=/Library/Developer/CommandLineTools/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 6.0 (clang-600.0.56) (based on LLVM 3.5svn)
Target: x86_64-apple-darwin13.4.0
Thread model: posix
kern $

Best, Marc



craigsapp commented 9 years ago

Your awk version is the GNU version, which is why your output was matching my Linux installation. OS X uses the BSD version, so you or some software package that you use installed the GNU utilities, and it took over the default awk that comes with OS X.

OS X is a BSD Unix and so uses "nawk"; Linux distributions typically have GNU tools, and so use "gawk". The two versions are not exactly the same, but hopefully the Humdrum tools behave the same under both (difficulties always pop up, but I squash them as they appear).

http://en.wikipedia.org/wiki/AWK#Versions_and_implementations

BWK awk or nawk refers to the version by Brian Kernighan. It has been dubbed the "One True AWK" because of the use of the term in association with the book that originally described the language and the fact that Kernighan was one of the original authors of AWK.[9] FreeBSD refers to this version as one-true-awk.[10] This version also has features not in the book, such as tolower and ENVIRON that are explained above; see the FIXES file in the source archive for details. This version is used by e.g. FreeBSD, NetBSD, OpenBSD and OS X.

gawk (GNU awk) is another free software implementation and the only implementation that makes serious progress implementing internationalization and localization and TCP/IP networking. It was written before the original implementation became freely available. It includes its own debugger, and its profiler enables the user to make measured performance enhancements to a script, and it also enables the user to extend functionality via shared libraries. Linux distributions are mostly GNU software, and so they include gawk. FreeBSD before version 5.0 also included gawk version 3.0 but subsequent versions of FreeBSD use BWK awk to avoid the more restrictive GNU General Public License (GPL) license as well as for its technical characteristics.
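To tell the two implementations apart when comparing test results, the first line printed by awk --version can be inspected. The Python sketch below classifies the two banners quoted in this thread. The function name and the returned labels are invented for illustration, and other awk implementations or versions may print banners this sketch does not recognize.

```python
def awk_flavor(banner: str) -> str:
    """Classify the output of `awk --version` by its first line.

    Only the two banner shapes quoted in this thread are recognized;
    anything else is reported as "unknown".
    """
    stripped = banner.strip()
    first = stripped.splitlines()[0] if stripped else ""
    if first.startswith("GNU Awk"):
        return "gawk"
    if first.startswith("awk version"):
        return "bwk-awk"
    return "unknown"

# Banners copied from the comments above.
print(awk_flavor("GNU Awk 4.1.1, API: 1.1"))
print(awk_flavor("awk version 20070501"))
```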

My C/C++ compiler is the same as yours (Apple LLVM v6), but a slightly older version (clang-600.0.51).