Closed (Baltazar500 closed 2 years ago)
Hello Baltazar500,
The string to enter for --download is actually an "extended string", but without the opening x" and closing ".
So with --download '{...}' you can insert any variable or function, just as you would for -e/--extract.
Related (duplicate?) issue: https://github.com/benibela/xidel/issues/38.
@Reino17, thanks. It works, but only for single expressions. When following page by page, I get an error after the file download:
xidel -f 'very-long-expression' --download '{replace($url, "^.*/", "")}' -f '//div[@file="text"]/a/@href'
Save as: List#1.txt
Error:
err:XPTY0004: Need context item that is a node to get root element
Possible backtrace:
$08120890 TXQUERYENGINE__EVALUATESINGLESTEPQUERY, line 9358 of /home/benito/hg/components/pascal/data/xquery.pas: perhaps TXQTermTryCatch + 136624 ? but unlikely
$080F5547 TXQTERMPATH__EVALUATE, line 3302 of /home/benito/hg/components/pascal/data/xquery_terms.inc: perhaps TXQTermBinaryOp + 3959 ? but unlikely
$080E090A TXQUERY__EVALUATE, line 7524 of /home/benito/hg/components/pascal/data/xquery.pas: perhaps Q{http://www.w3.org/2005/xpath-functions}concat + 41114 ? but unlikely
$080E0A3D TXQUERY__EVALUATE, line 7549 of /home/benito/hg/components/pascal/data/xquery.pas: perhaps Q{http://www.w3.org/2005/xpath-functions}concat + 41421 ? but unlikely
$08080A64 TPROCESSINGCONTEXT__EVALUATEQUERY, line 2218 of xidelbase.pas: perhaps ? ? but unlikely
$0808002C SUBPROCESS, line 2062 of xidelbase.pas: perhaps ? ? but unlikely
$0807F64C TPROCESSINGCONTEXT__PROCESS, line 2079 of xidelbase.pas: perhaps ? ? but unlikely
$080801F0 PROCESSFOLLOWTO, line 1998 of xidelbase.pas: perhaps ? ? but unlikely
$08080061 SUBPROCESS, line 2065 of xidelbase.pas: perhaps ? ? but unlikely
$0807F8D1 TPROCESSINGCONTEXT__PROCESS, line 2098 of xidelbase.pas: perhaps ? ? but unlikely
$0808A3BB PERFORM, line 3891 of xidelbase.pas: perhaps ? ? but unlikely
$080493D9 main, line 84 of xidel.pas: perhaps ? ? but unlikely
Call xidel with --trace-stack to get an actual backtrace
When using --download after following to the next page:
xidel -f 'very-long-expression' -f '//div[@file="text"]/a/@href' --download '{replace($url, "^.*/", "")}'
the file is not downloaded and I get an error.
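For readers unfamiliar with the expression passed to --download in the commands above: replace($url, "^.*/", "") removes everything up to and including the last slash of the URL, so only the final path segment is left as the file name. A rough shell sketch of the same idea, using sed rather than xidel (the function name url_basename and the example.com URL are made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper, not part of xidel: mimics replace($url, "^.*/", "")
# with sed to show which part of the URL ends up as the file name.
url_basename() {
  # The greedy ^.*/ matches through the LAST slash, so only the
  # final path segment remains. '|' is the delimiter so '/' needs no escaping.
  printf '%s\n' "$1" | sed 's|^.*/||'
}

# Illustrative URL (an assumption, not taken from the thread)
url_basename 'https://example.com/files/report.txt'
```

Note that a URL ending in a slash yields an empty name with this pattern, and any query string after the last slash is kept as part of the name.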
That's because, as the error-message mentions, there's no context item. You didn't provide input (a file or an url).
If the download name is a directory, it uses the name from the URL. So you can do --download .
Also, the last option should preferably not be a -f.
@Reino17
That's because, as the error-message mentions, there's no context item. You didn't provide input (a file or an url).
When using a 'very-long-expression' as an extraction (-e), I get links from each following page. When using follow (and download) it ends on the first page :(
@benibela
If the download name is a directory, it uses the name from the URL. So you can do --download .
This works like the expression --download '{replace($url, "^.*/", "")}', but only the file from the base link is downloaded. The next (followed) page does not load and throws an error.
Error: err:XPTY0004: Need context item that is a node to get root element
When using a 'very-long-expression' as an extraction (-e), I get links from each following page. When using follow (and download) it ends on the first page :(
Try it with some input:
xidel '<start/>' -f 'very-long-expression' -f '//div[@file="text"]/a/@href' --download '{replace($url, "^.*/", "")}'
@benibela, sorry, the site I'm extracting data from has stopped working. Once it is back up, I will try this trick. Thanks :)
My problem was solved by using a "[ -f xxx ]" loop:
xidel [-f 'very-long-expression' --download '{replace($url, "^.*/", "")}' ] -f '//div[@file="text"]/a/@href'
How can I download a file with its original name or its content-disposition name using xidel, without curl/wget? The "--download=" switch only lets you save the file under a specified file name :(