cekdahl / jSoupLink

HTML parser for Mathematica/Wolfram Language

`:containsData` doesn't work #5

Open Juddd opened 1 year ago

Juddd commented 1 year ago

As your link points out: [image] [image]

I found some of the available Selector APIs. But why does this code report an error?

obj = Import["https://stackexchange.com", "HTMLDOM"];
obj["Select", ":containsData(fkey)"]

cekdahl commented 1 year ago

Hi. What is the error message? The copy of the library in jsoupLink hasn't been updated for some time and it's possible that they have introduced new features since then. This is one possibility.

Juddd commented 1 year ago

This is the error message: [image: Snipaste_2023-06-26_18-45-24]

It's a wonderful library, but unfortunately not many users seem to know it exists.

cekdahl commented 1 year ago

It's likely that it was introduced in a later version of Jsoup, so Jsoup has to be updated for this to work.

felixkasza commented 5 months ago

@Juddd and @cekdahl, I took the liberty of forking jsoupLink (felixkasza/jsoupLink) and updating the jar file to 1.17.2; this works just fine now:

obj = Import["https://stackexchange.com", "HTMLDOM"];
obj["Select", ":containsData(fkey)"]

(I also updated the file structure and the PacletInfo to current (v12.3+) Mathematica standards.)
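For context, jsoup's `:containsData(...)` pseudo-selector matches elements whose data (for example the body of a `script` or `style` element) contains the given text. A minimal sketch of narrowing the match, assuming jsoupLink is already loaded with a jar recent enough to support the selector; the `script` prefix and the `scripts` variable are illustrative choices, not part of the original report:

```mathematica
(* Assumes jsoupLink is already loaded and bundles a jsoup version
   that supports :containsData *)
obj = Import["https://stackexchange.com", "HTMLDOM"];

(* Restrict the match to script elements whose data contains "fkey" *)
scripts = obj["Select", "script:containsData(fkey)"];
```

Prefixing the pseudo-selector with an element name is ordinary jsoup selector composition; whether it returns anything depends on the page content at the time of the request.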

cekdahl commented 5 months ago

@felixkasza This looks really good. As it happens, I am actually about to release the first new version of jsoupLink in years, and this looks like it could help.

felixkasza commented 5 months ago

@cekdahl That'll be my first PR ever, then (yay?). And thanks, by the way, for jsoupLink in the first place. I love it!

cekdahl commented 5 months ago

@felixkasza I will upload my updates in the coming days, then you can make a PR on top of that. I am currently taking the steps necessary to submit the package to WR's package repository.

felixkasza commented 5 months ago

Excellent, thanks! I'll cancel the current PR.

cekdahl commented 4 months ago

@felixkasza I have the updates I intended to make on a branch (some updates to e.g. the README still to come): https://github.com/cekdahl/jSoupLink/tree/feature/rel1.1

I'm using the jsoupLink.nb to build it. `CreatePacletArchive@"jsoupLink"` creates the paclet file, but when I run `PacletInstall[%, "IgnoreVersion" -> True]`, it prints the following error message:

[image: screenshot of the error message]

Your branch had a lot of information about building the paclet in it. Do you see what's wrong?

felixkasza commented 4 months ago

Let me grab that branch and have a look.

Edited because GitHub's mail-to-comment conversion sucks.

felixkasza commented 4 months ago

OK, the underlying issue is the use of the PackPaclet[] function (as opposed to the newer PacletTools stuff); PackPaclet[] does not handle the current style of documentation (and the way PackPaclet[] wants it is undocumented, AFAIK).

I took the liberty of bringing everything up to current (MMA 12.3+) expectations and adding a new build script and manual build instructions; the PR is in #8. As a consequence, I would retire the current jsoupLink.nb.
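As a rough illustration of the PacletTools route (a sketch only, not the build script from the PR): the directory path below is a placeholder, and the code assumes the build result exposes the archive under the "PacletArchive" key.

```mathematica
(* Sketch only: pacletDir is a hypothetical location, not the repository's actual layout *)
Needs["PacletTools`"]

(* Directory that contains PacletInfo.wl (placeholder path) *)
pacletDir = FileNameJoin[{Directory[], "jsoupLink"}];

(* Build the paclet; on success the result describes the generated files *)
result = PacletBuild[pacletDir];

(* Assumption: the archive path is available under the "PacletArchive" key *)
archive = result["PacletArchive"];

(* Install the freshly built archive even if the same version is already installed *)
PacletInstall[archive, ForceVersionInstall -> True]
```

`ForceVersionInstall -> True` is used here in place of the older `"IgnoreVersion"` option so that a rebuilt archive with an unchanged version number replaces the installed copy.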

cekdahl commented 4 months ago

Thank you for the fast response. I didn't have time today, but I'll try to have a look at it tomorrow.