Closed sarabeckman closed 1 year ago
I believe the fix for #225 also fixed this one, but I fixed a couple of typos as well. Fixed in ddr-cmdln master branch, commit 74981bb, and pushed.
Tested on kyuzo. Files are located in /media/qnfs/kinkura/working/ddr-ajah-8/image
(cmdln) ddr@kyuzo:/media/qnfs/kinkura/working/ddr-ajah-8/image$ ddrimport file ./ddr-ajah-8-testaccessfiles.csv /media/qnfs/kinkura/gold/ddr-ajah-8
2023-02-08 16:31:47,299 DEBUG <DDR.identifier.Identifier collection:ddr-ajah-8>
2023-02-08 16:31:47,300 INFO Checking CSV file
2023-02-08 16:31:47,300 INFO 12 rows
2023-02-08 16:31:47,302 DEBUG Starting new HTTPS connection (1): partner.densho.org:443
2023-02-08 16:31:47,530 DEBUG https://partner.densho.org:443 "GET /vocab/api/0.2/index.json HTTP/1.1" 200 None
2023-02-08 16:31:47,548 DEBUG getting vocab: https://partner.densho.org/vocab/api/0.2/genre.json
2023-02-08 16:31:47,549 DEBUG getting vocab: https://partner.densho.org/vocab/api/0.2/language.json
2023-02-08 16:31:47,550 DEBUG Starting new HTTPS connection (1): partner.densho.org:443
2023-02-08 16:31:47,551 DEBUG getting vocab: https://partner.densho.org/vocab/api/0.2/facility.json
2023-02-08 16:31:47,551 DEBUG Starting new HTTPS connection (1): partner.densho.org:443
2023-02-08 16:31:47,553 DEBUG getting vocab: https://partner.densho.org/vocab/api/0.2/format.json
2023-02-08 16:31:47,553 DEBUG Starting new HTTPS connection (1): partner.densho.org:443
2023-02-08 16:31:47,554 DEBUG Starting new HTTPS connection (1): partner.densho.org:443
2023-02-08 16:31:47,555 DEBUG getting vocab: https://partner.densho.org/vocab/api/0.2/rights.json
2023-02-08 16:31:47,557 DEBUG Starting new HTTPS connection (1): partner.densho.org:443
2023-02-08 16:31:47,558 DEBUG getting vocab: https://partner.densho.org/vocab/api/0.2/public.json
2023-02-08 16:31:47,560 DEBUG Starting new HTTPS connection (1): partner.densho.org:443
2023-02-08 16:31:47,560 DEBUG getting vocab: https://partner.densho.org/vocab/api/0.2/status.json
2023-02-08 16:31:47,562 DEBUG Starting new HTTPS connection (1): partner.densho.org:443
2023-02-08 16:31:47,577 DEBUG getting vocab: https://partner.densho.org/vocab/api/0.2/topics.json
2023-02-08 16:31:47,579 DEBUG Starting new HTTPS connection (1): partner.densho.org:443
2023-02-08 16:31:47,600 DEBUG https://partner.densho.org:443 "GET /vocab/api/0.2/language.json HTTP/1.1" 200 None
2023-02-08 16:31:47,608 DEBUG https://partner.densho.org:443 "GET /vocab/api/0.2/format.json HTTP/1.1" 200 None
2023-02-08 16:31:47,617 DEBUG https://partner.densho.org:443 "GET /vocab/api/0.2/public.json HTTP/1.1" 200 None
2023-02-08 16:31:47,619 DEBUG https://partner.densho.org:443 "GET /vocab/api/0.2/rights.json HTTP/1.1" 200 None
2023-02-08 16:31:47,624 DEBUG https://partner.densho.org:443 "GET /vocab/api/0.2/genre.json HTTP/1.1" 200 None
2023-02-08 16:31:47,626 DEBUG https://partner.densho.org:443 "GET /vocab/api/0.2/facility.json HTTP/1.1" 200 None
2023-02-08 16:31:47,638 DEBUG https://partner.densho.org:443 "GET /vocab/api/0.2/status.json HTTP/1.1" 200 None
2023-02-08 16:31:47,642 DEBUG https://partner.densho.org:443 "GET /vocab/api/0.2/topics.json HTTP/1.1" 200 None
2023-02-08 16:31:47,683 INFO Validating headers
2023-02-08 16:31:47,683 INFO Validating rows
2023-02-08 16:31:47,687 INFO Validating file imports
2023-02-08 16:31:47,687 INFO Checking repository
2023-02-08 16:31:47,695 INFO <git.repo.base.Repo '/media/qnfs/kinkura/gold/ddr-ajah-8/.git'>
2023-02-08 16:31:47,695 DEBUG Popen(['git', 'diff', '--cached', '--name-only'], cwd=/media/qnfs/kinkura/gold/ddr-ajah-8, universal_newlines=False, shell=None, istream=None)
Traceback (most recent call last):
  File "/opt/ddr-cmdln/venv/cmdln/bin/ddrimport", line 33, in <module>
    sys.exit(load_entry_point('ddr-cmdln==5.6.1', 'console_scripts', 'ddrimport')())
  File "/opt/ddr-cmdln/venv/cmdln/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/opt/ddr-cmdln/venv/cmdln/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/opt/ddr-cmdln/venv/cmdln/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/ddr-cmdln/venv/cmdln/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/ddr-cmdln/venv/cmdln/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/opt/ddr-cmdln/venv/cmdln/lib/python3.9/site-packages/ddr_cmdln-5.6.1-py3.9.egg/DDR/cli/ddrimport.py", line 207, in file
    run_checks(
  File "/opt/ddr-cmdln/venv/cmdln/lib/python3.9/site-packages/ddr_cmdln-5.6.1-py3.9.egg/DDR/cli/ddrimport.py", line 304, in run_checks
    staged,modified = batch.Checker.check_repository(ci)
  File "/opt/ddr-cmdln/venv/cmdln/lib/python3.9/site-packages/ddr_cmdln-5.6.1-py3.9.egg/DDR/batch.py", line 182, in check_repository
    return dvcs.list_staged(repo), dvcs.list_modified(repo)
  File "/opt/ddr-cmdln/venv/cmdln/lib/python3.9/site-packages/ddr_cmdln-5.6.1-py3.9.egg/DDR/dvcs.py", line 282, in list_staged
    stdout = repo.git.diff('--cached', '--name-only')
  File "/opt/ddr-cmdln/venv/cmdln/lib/python3.9/site-packages/git/cmd.py", line 741, in <lambda>
    return lambda *args, **kwargs: self._call_process(name, *args, **kwargs)
  File "/opt/ddr-cmdln/venv/cmdln/lib/python3.9/site-packages/git/cmd.py", line 1315, in _call_process
    return self.execute(call, **exec_kwargs)
  File "/opt/ddr-cmdln/venv/cmdln/lib/python3.9/site-packages/git/cmd.py", line 1109, in execute
    raise GitCommandError(redacted_command, status, stderr_value, stdout_value)
git.exc.GitCommandError: Cmd('git') failed due to: exit code(129)
cmdline: git diff --cached --name-only
stderr: 'error: unknown option `cached'
usage: git diff --no-index [<options>] <path> <path>
Diff output format options
-p, --patch generate patch
-s, --no-patch suppress diff output
-u generate patch
-U, --unified[=<n>] generate diffs with <n> lines context
-W, --function-context
generate diffs with <n> lines context
--raw generate the diff in raw format
--patch-with-raw synonym for '-p --raw'
--patch-with-stat synonym for '-p --stat'
--numstat machine friendly --stat
--shortstat output only the last line of --stat
-X, --dirstat[=<param1,param2>...]
output the distribution of relative amount of changes for each sub-directory
--cumulative synonym for --dirstat=cumulative
--dirstat-by-file[=<param1,param2>...]
synonym for --dirstat=files,param1,param2...
--check warn if changes introduce conflict markers or whitespace errors
--summary condensed summary such as creations, renames and mode changes
--name-only show only names of changed files
--name-status show only names and status of changed files
--stat[=<width>[,<name-width>[,<count>]]]
generate diffstat
--stat-width <width> generate diffstat with a given width
--stat-name-width <width>
generate diffstat with a given name width
--stat-graph-width <width>
generate diffstat with a given graph width
--stat-count <count> generate diffstat with limited lines
--compact-summary generate compact summary in diffstat
--binary output a binary diff that can be applied
--full-index show full pre- and post-image object names on the "index" lines
--color[=<when>] show colored diff
--ws-error-highlight <kind>
highlight whitespace errors in the 'context', 'old' or 'new' lines in the diff
-z do not munge pathnames and use NULs as output field terminators in --raw or --numstat
--abbrev[=<n>] use <n> digits to display object names
--src-prefix <prefix>
show the given source prefix instead of "a/"
--dst-prefix <prefix>
show the given destination prefix instead of "b/"
--line-prefix <prefix>
prepend an additional prefix to every line of output
--no-prefix do not show any source or destination prefix
--inter-hunk-context <n>
show context between diff hunks up to the specified number of lines
--output-indicator-new <char>
specify the character to indicate a new line instead of '+'
--output-indicator-old <char>
specify the character to indicate an old line instead of '-'
--output-indicator-context <char>
specify the character to indicate a context instead of ' '
Diff rename options
-B, --break-rewrites[=<n>[/<m>]]
break complete rewrite changes into pairs of delete and create
-M, --find-renames[=<n>]
detect renames
-D, --irreversible-delete
omit the preimage for deletes
-C, --find-copies[=<n>]
detect copies
--find-copies-harder use unmodified files as source to find copies
--no-renames disable rename detection
--rename-empty use empty blobs as rename source
--follow continue listing the history of a file beyond renames
-l <n> prevent rename/copy detection if the number of rename/copy targets exceeds given limit
Diff algorithm options
--minimal produce the smallest possible diff
-w, --ignore-all-space
ignore whitespace when comparing lines
-b, --ignore-space-change
ignore changes in amount of whitespace
--ignore-space-at-eol
ignore changes in whitespace at EOL
--ignore-cr-at-eol ignore carrier-return at the end of line
--ignore-blank-lines ignore changes whose lines are all blank
-I, --ignore-matching-lines <regex>
ignore changes whose all lines match <regex>
--indent-heuristic heuristic to shift diff hunk boundaries for easy reading
--patience generate diff using the "patience diff" algorithm
--histogram generate diff using the "histogram diff" algorithm
--diff-algorithm <algorithm>
choose a diff algorithm
--anchored <text> generate diff using the "anchored diff" algorithm
--word-diff[=<mode>] show word diff, using <mode> to delimit changed words
--word-diff-regex <regex>
use <regex> to decide what a word is
--color-words[=<regex>]
equivalent to --word-diff=color --word-diff-regex=<regex>
--color-moved[=<mode>]
moved lines of code are colored differently
--color-moved-ws <mode>
how white spaces are ignored in --color-moved
Other diff options
--relative[=<prefix>]
when run from subdir, exclude changes outside and show relative paths
-a, --text treat all files as text
-R swap two inputs, reverse the diff
--exit-code exit with 1 if there were differences, 0 otherwise
--quiet disable all output of the program
--ext-diff allow an external diff helper to be executed
--textconv run external text conversion filters when comparing binary files
--ignore-submodules[=<when>]
ignore changes to submodules in the diff generation
--submodule[=<format>]
specify how differences in submodules are shown
--ita-invisible-in-index
hide 'git add -N' entries from the index
--ita-visible-in-index
treat 'git add -N' entries as real in the index
-S <string> look for differences that change the number of occurrences of the specified string
-G <regex> look for differences that change the number of occurrences of the specified regex
--pickaxe-all show all changes in the changeset with -S or -G
--pickaxe-regex treat <string> in -S as extended POSIX regular expression
-O <file> control the order in which files appear in the output
--find-object <object-id>
look for differences that change the number of occurrences of the specified object
--diff-filter [(A|C|D|M|R|T|U|X|B)...[*]]
select files by diff type
--output <file> Output to a specific file
That's an interesting one. Guess it's not fixed after all...
The git.exc.GitCommandError occurs because /media/qnfs/kinkura/gold was owned by ansible.ansible for some reason. The error above doesn't say anything about permissions, but it goes away when you chown -R ddr.ddr the repo.
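A possible explanation, offered as a guess rather than a diagnosis: the usage line in the error reads "git diff --no-index", which is the mode git falls back to when it does not treat the current directory as a repository, and recent git versions can refuse repositories owned by a different user. A minimal sketch of an ownership check that could run before any git command, so the failure mentions ownership instead of an unknown option. The function names here are hypothetical and are not part of ddr-cmdln.

```python
import os
import pwd


def owner_name(path):
    """Return the user name that owns `path` (falls back to the numeric uid)."""
    uid = os.stat(path).st_uid
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # uid has no entry in the password database
        return str(uid)


def owned_by_current_user(path):
    """True if `path` is owned by the user running this process."""
    return os.stat(path).st_uid == os.geteuid()


def check_repo_ownership(repo_path):
    """Raise a readable error before any git command is attempted."""
    if not owned_by_current_user(repo_path):
        raise PermissionError(
            f"{repo_path} is owned by {owner_name(repo_path)!r}, "
            "not by the current user; git may not treat it as a repository"
        )
```

Calling check_repo_ownership("/media/qnfs/kinkura/gold/ddr-ajah-8") as the ddr user would then raise a PermissionError naming the owning user whenever the directory belongs to someone else.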
This is fixed on the ddr-cmdln develop branch as of commit a403f7ac8c.
When loading objects from CSV, the Identifier.basepath is not set. I believe my thinking was that they're in a working directory and not in their final location, i.e. the repository path.
DDR.models.common.load_csv compares field values in rowd objects (objects from a row in the CSV) with existing ones to mark field values that are modified.
Identifier objects are considered to be non-equal if their values of path_abs() are different.
In this case, DDR.identifier.MissingBasepathException is triggered because Identifier.path_abs() requires a basepath value, which has not been set on the rowd object's Identifier.
The fix is to modify DDR.models.common.load_csv to set a temporary basepath just before doing this comparison.
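To illustrate the idea, here is a toy sketch of the failure and of a temporary basepath. The Identifier class below is a simplified stand-in, not the real DDR.identifier.Identifier; its path layout and the temporary_basepath helper are assumptions made for the example, and the actual change in load_csv may look different.

```python
from contextlib import contextmanager
import os


class MissingBasepathException(Exception):
    pass


class Identifier:
    """Toy stand-in for DDR.identifier.Identifier (not the real class)."""

    def __init__(self, id, basepath=None):
        self.id = id
        self.basepath = basepath

    def path_abs(self):
        # An absolute path cannot be built without a basepath.
        if not self.basepath:
            raise MissingBasepathException(f"{self.id} has no basepath")
        return os.path.join(self.basepath, self.id)

    def __eq__(self, other):
        # Equality is defined by absolute path, as described above.
        return self.path_abs() == other.path_abs()


@contextmanager
def temporary_basepath(identifier, basepath):
    """Set basepath for the duration of a comparison, then restore it."""
    previous = identifier.basepath
    identifier.basepath = basepath
    try:
        yield identifier
    finally:
        identifier.basepath = previous


existing = Identifier("ddr-ajah-8", basepath="/media/qnfs/kinkura/gold")
from_csv = Identifier("ddr-ajah-8")  # loaded from a CSV row: no basepath yet

# Comparing directly would raise MissingBasepathException;
# borrowing the existing object's basepath makes the comparison possible.
with temporary_basepath(from_csv, existing.basepath):
    same = from_csv == existing
```

Restoring the previous value in the finally clause keeps the CSV-side object unchanged after the comparison, which matches the original intent that row objects have no final location yet.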
(I also added a note to the DDR.identifier.Identifier.__eq__ documentation noting what DDR.models.common.load_csv is doing.)
A better fix would be to modify DDR.identifier.Identifier.__eq__ to accept an ignore_basepath argument, but that did not work in testing.
Part of the oral history workflow is using the ddrimport file update feature to include access images for the external files. I export the file CSV, add an "access_path" column for the signature image, then import the CSV. Once again the new validation step is triggered when the original files weren't in the repository.
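For reference, the "add an access_path column" step could be scripted roughly as below. This is only a sketch: the "id" key column, the sample values, and the add_access_path helper are hypothetical, and the real export may use different column names.

```python
import csv
import io


def add_access_path(csv_text, access_paths, key="id"):
    """Return csv_text with an access_path column filled from access_paths.

    access_paths maps the value in the `key` column to an image path;
    rows without a match get an empty access_path.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])
    if "access_path" not in fieldnames:
        fieldnames.append("access_path")
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row["access_path"] = access_paths.get(row[key], "")
        writer.writerow(row)
    return out.getvalue()


exported = "id,label\nfile-1,Interview\n"  # made-up sample export
updated = add_access_path(exported, {"file-1": "sig.jpg"})
```

The resulting text can be written to a new CSV file and passed to ddrimport file in the same way as the hand-edited one.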
I also got a new error message once I tried to run the import command with the files in the repository.
ddr-ajah-8-accessfiles.csv