Closed: JSB97 closed this issue 9 years ago
Hi, are you using the CRAN or the GitHub version? Since you mentioned that "//table[6]" worked, I assume it is an older version. I would suggest trying the most recent version from GitHub, as it appears to render all the tables correctly.
In a loop, "table.df <- htmltab(doc = html.file,which=i)" should now work correctly. As you mentioned, the real tables are numbers 2, 4 and 6. I would suggest using a more stable XPath than the position, maybe based on the class or id attribute or the number of tr children. Please also note that which = "//table[6]" doesn't necessarily identify the same table as which = 6, since tables can be nested: in XPath, "//table[6]" matches any table that is the sixth table child of its own parent, which is not the same as the sixth table in the document.
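To illustrate the nesting point, here is a minimal sketch that uses the XML package directly rather than htmltab. The HTML snippet and its id values are made up for the example, and it assumes the XML package is installed. It compares a positional predicate with a rank in document order, and shows an attribute-based XPath as a more stable alternative.

```r
library(XML)

# Hypothetical document: table "b" is nested inside table "a",
# and table "c" is a sibling of "a" under <body>.
html <- paste0(
  "<html><body>",
  "<table id='a'><tr><td>",
  "<table id='b'><tr><td>x</td></tr></table>",
  "</td></tr></table>",
  "<table id='c'><tr><td>y</td></tr></table>",
  "</body></html>"
)
doc <- htmlParse(html, asText = TRUE)

# "//table[2]": tables that are the 2nd table child of their parent
by_predicate <- xpathSApply(doc, "//table[2]", xmlGetAttr, "id")

# "(//table)[2]": the 2nd table in document order, nested ones included
by_rank <- xpathSApply(doc, "(//table)[2]", xmlGetAttr, "id")

# Attribute-based XPath: does not depend on position at all
by_id <- xpathSApply(doc, "//table[@id='c']", xmlGetAttr, "id")

print(by_predicate)  # "c"
print(by_rank)       # "b"
print(by_id)         # "c"
```

The same kind of attribute-based expression can be passed as a character string to the which argument of htmltab, provided the target table actually carries a usable id or class.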
I was using version 0.5.0 and just upgraded to 0.6.0 from CRAN. Using "which" now seems to work. Thank you for the suggestion.
I am parsing tables from here: https://www.dropbox.com/s/dkh2b7qifjhj0sc/0103010_honbun_jpcrp030000-asr-001_E01814-000_2015-03-31_01_2015-06-22_ixbrl.htm?dl=0
Using the following code: table.df <- htmltab(doc = html.file,which=i)
where i = 1, 2, ..., 6.
For i = 2 and 4 the tables are parsed correctly, and for i = 1, 3 and 5 the tables are empty, but for some reason i = 6 returns this error: Error: Couldn't identify table. Try passing (a different) information to the which argument.
However, if I isolate the HTML for this table and store it in a new file, htmltab is able to return the table as expected.
Also, if I try the below, I can parse the table:
htmltab(doc = html.file,which="//table[6]")
Could you please shed light on why the first syntax I tried fails whereas the last one seems to work?