Closed bondjimbond closed 2 years ago
To possibly assist with debugging, note that the error created by preg_match() happens here: https://github.com/MarcusBarnes/mik/blob/12cdadd9a6e334043679f064f32ecd6c146842b7/src/filegetters/CsvNewspapers.php#L161
@MarcusBarnes do you think it's a code problem, or likely something I'm missing in the .ini file? I can't see anything wrong with my CSV.
Not sure yet. I'll share if I have any leads for you to follow up on.
Is it applying the directory regex to the metadata for some reason?
\"key\":\"46\",\"Directory\":\"1946-11-20\",\"Identifier\":\"ASMN-001216\
It looks like it's inserting the DIRECTORY_SEPARATOR in between the metadata keys and values for reasons I can't grasp.
@bondjimbond Based on the second error message, would you check that the filename and access permissions on /Volumes/UFV_FILES/UFV-ASMN-1946/1946/ that corresponds to key 46 are all correct and that there's nothing there that is suspicious?
Another approach to debugging is to add some print_r statements for $path and $directory_regex just before https://github.com/MarcusBarnes/mik/blob/12cdadd9a6e334043679f064f32ecd6c146842b7/src/filegetters/CsvNewspapers.php#L161 to see if there's anything there that would cause problems. For $directory_regex, we use '#' as the regex delimiter. Is that interacting with the file path corresponding to key 46 (or key 45) in some way?
Hm, I tried print_r both inside and outside of the loop, but nothing printed.
File permissions all look OK: -rwxrwxrwx
@bondjimbond Try using dump() instead? https://github.com/MarcusBarnes/mik/blob/master/src/utilities/Dumper.php
How does that work? If I try dump($path) I get fatal error: Uncaught Error: Call to undefined function mik\filegetters\dump()
OK, var_dump worked. Here are the results for $path and $directory_regex:
string(67) "/Volumes/UFV_FILES/UFV-ASMN-1946/1946/1946-01-09/1946-01-09-001.tif"
string(18) "\#/1946\-01\-09/\#"
string(67) "/Volumes/UFV_FILES/UFV-ASMN-1946/1946/1946-01-09/1946-01-09-001.tif"
string(18) "\#/1946\-01\-16/\#"
[etc]
So based on this error: "Delimiter must not be alphanumeric or backslash", is MIK interpreting the backslash that is added by the regex function as a delimiter rather than as an escape character?
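A minimal sketch that may help answer the question above, using the values from the var_dump output. It assumes PHP 7.3 or later, where preg_quote() also escapes '#'; on older versions the '#' is left alone and the pattern would still start with its delimiter. The variable names `$quoted` and `$starts_with_backslash` are only for illustration and are not from MIK.

```php
<?php
// Values copied from the var_dump output earlier in the thread.
$path = '/Volumes/UFV_FILES/UFV-ASMN-1946/1946/1946-01-09/1946-01-09-001.tif';
$directory_regex = '#/1946-01-09/#';

// Quote the whole pattern, delimiters included, as the code in question does.
$quoted = preg_quote($directory_regex);

// On PHP >= 7.3 the leading '#' becomes '\#', so the first character is a backslash.
$starts_with_backslash = ($quoted[0] === '\\');

// PCRE treats the first character of the pattern as the delimiter.
// The warning is suppressed here so the script keeps running;
// preg_match() returns false when the pattern is invalid.
$quoted_result = @preg_match($quoted, $path);

// The unquoted pattern still has '#' as its first character and matches.
$unquoted_result = preg_match($directory_regex, $path);

var_dump($quoted, $starts_with_backslash, $quoted_result, $unquoted_result);
```

If `$starts_with_backslash` is true on your PHP version, the backslash is being read as the delimiter, which would be consistent with the error message.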
The last time I used CsvNewspapers it worked -- wondering if the problem comes from a commit between then and now?
I see there's a commit that affects the directory path here: https://github.com/MarcusBarnes/mik/commit/47884d71ad16c44a4dbbbc98cf58edd0879cdc55#diff-35c8aa40361be605efe8d8fa1a65bf7affad5853bfa9e003c5d318196b7f4f8c
Wondering if that might be the cause..
No -- tried reverting that file to the older state, continuing to get the same error.
I also tried rewriting the .ini file from scratch, no change.
The problem has to be here, right? "directory_regex":"\\#1946\\-01\\-09\\#"
I added a before-and-after var_dump to see what happens to the $directory_regex variable here:
$directory_regex = '#' . DIRECTORY_SEPARATOR . $issue_directory . DIRECTORY_SEPARATOR . '#';
var_dump($directory_regex);
$directory_regex = preg_quote($directory_regex);
var_dump($directory_regex);
and this is what I see:
string(14) "#/1946-12-11/#"
string(18) "\#/1946\-12\-11/\#"
string(14) "#/1946-12-18/#"
string(18) "\#/1946\-12\-18/\#"
While in mik.log the backslash -- which is intended to escape the hyphen -- is doubled up. So it looks like the escaping happens twice somehow, which is probably the cause of this problem.
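One other possibility worth ruling out for the doubled backslash: if the mik.log entries are JSON-encoded (the escaped quotes in the log excerpt suggest this, but it is an assumption, not something confirmed in this thread), then a single backslash in the string would be written as two in the log without any second round of escaping. A small sketch of that effect:

```php
<?php
// The quoted pattern as var_dump() reported it: single backslashes, 18 characters.
$directory_regex = '\#/1946\-01\-09/\#';

// JSON writes each literal backslash as two backslashes.
// JSON_UNESCAPED_SLASHES keeps '/' as-is so only the backslashes change.
$encoded = json_encode(['directory_regex' => $directory_regex], JSON_UNESCAPED_SLASHES);
echo $encoded, PHP_EOL;

// Decoding gives back the original string with single backslashes.
$decoded = json_decode($encoded, true);
var_dump(strlen($directory_regex), $decoded['directory_regex'] === $directory_regex);
```

If that is what is happening, the doubling in the log is only how the value is serialized, and the string passed to preg_match() is the 18-character one that var_dump shows.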
SUCCESS
I removed the preg_quote line, and now MIK runs successfully.
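For anyone who would rather keep the escaping than drop it, a possible alternative is to quote only the directory portion and add the delimiters afterwards. This is a sketch, not the change that was made in MIK: `$literal` is a name introduced here for illustration, the `$issue_directory` value is taken from the output above, and the example path assumes a Unix-like system where DIRECTORY_SEPARATOR is '/'.

```php
<?php
// Illustrative values taken from the var_dump output in this thread.
$issue_directory = '1946-01-09';
$path = '/Volumes/UFV_FILES/UFV-ASMN-1946/1946/1946-01-09/1946-01-09-001.tif';

// Escape only the literal text. The second argument tells preg_quote()
// which delimiter is in use so it gets escaped if it appears in the text.
$literal = DIRECTORY_SEPARATOR . $issue_directory . DIRECTORY_SEPARATOR;
$directory_regex = '#' . preg_quote($literal, '#') . '#';

// The pattern now starts and ends with an unescaped '#'.
var_dump($directory_regex);
var_dump(preg_match($directory_regex, $path));
```

Keeping preg_quote() on the inner part means a directory name containing regex metacharacters such as '.' or '+' would still be matched literally.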
Strange error coming up when I try to process a Newspaper CSV. I can't understand what's wrong with my delimiters. It looks like there's some hidden issue in my CSV file perhaps. Any ideas?
CSVs.zip newspapers.ini.zip