vipulg13 closed this issue 3 years ago
I am sorry; I should have posted this issue to the fulltext team, since they provide the download of ScienceDirect articles in XML format. In any case, I am using both the "fulltext" and "rscopus" packages for my research project. I am looking for a way to get data objects for the tables embedded in the open access articles that Scopus provides. As far as I know, the existing function "download_object()" only supports extracting image objects (including GIFs), not table objects.
Sorry - without a reproducible example, I can't really help.
Here is a reproducible example:
library(rscopus)
doi <- "10.1016/j.cja.2020.04.031"
art <- article_retrieval(id = doi,
                         identifier = "doi",
                         view = "FULL",
                         verbose = FALSE)
# Backticks are required because the element name contains hyphens
lstArt <- art$content$`full-text-retrieval-response`
Please note that lstArt includes a list element named "objects", which in turn contains paths for downloading the embedded images. Similar functionality is missing for embedded tables. For example, this research article contains two embedded tables. Their data is flattened into raw text along with the rest of the article text, available under lstArt$originalText. Retrieving and restructuring the table data from that raw text appears to be very complicated. Is there any way to retrieve these tables at the API level and provide them as an accessible R object?
Let me know if you require further information from my side.
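For illustration, if the article XML is available (for example through the fulltext package mentioned above), the tables could in principle be read from the markup instead of the flattened text. The following is only a minimal sketch using xml2 on a made-up toy document. The CALS-style element names (table, tgroup, thead, tbody, row, entry) and the helper table_to_df() are assumptions, and real Elsevier XML will likely need namespace-aware XPath and handling of merged cells.

```r
library(xml2)

# Toy XML standing in for a full-text document. The element names are an
# assumption for illustration, not a confirmed schema.
xml_txt <- '
<doc>
  <table id="t1">
    <caption>Material properties</caption>
    <tgroup cols="2">
      <thead><row><entry>Parameter</entry><entry>Value</entry></row></thead>
      <tbody>
        <row><entry>Density</entry><entry>7.8</entry></row>
        <row><entry>Hardness</entry><entry>210</entry></row>
      </tbody>
    </tgroup>
  </table>
</doc>'

doc <- read_xml(xml_txt)

# Hypothetical helper: convert one <table> node into a data frame of
# character columns. Assumes every body row has the same number of entries.
table_to_df <- function(tbl) {
  header <- xml_text(xml_find_all(tbl, ".//thead/row[1]/entry"))
  rows   <- xml_find_all(tbl, ".//tbody/row")
  cells  <- lapply(rows, function(r) xml_text(xml_find_all(r, "./entry")))
  df <- as.data.frame(do.call(rbind, cells), stringsAsFactors = FALSE)
  names(df) <- header
  df
}

# One data frame per table found in the document
tables <- lapply(xml_find_all(doc, ".//table"), table_to_df)
```

Each element of tables is then an ordinary data frame, so numeric columns can be converted afterwards with as.numeric() where appropriate.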
Does the documentation at https://dev.elsevier.com/documentation/ArticleRetrievalAPI.wadl indicate that these tables are available?
I don't think it embeds tables in there:
obj <- object_retrieval(id = "10.1016/j.cja.2020.04.031",
identifier = "doi")
df <- jsonlite::fromJSON(
  httr::content(obj$get_statement, as = "text"),
  flatten = TRUE)
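One way to check a result like this for table objects is to tabulate the object types it lists. The sketch below runs on a made-up JSON body, not on the real API response. The field names (attachments, ref, mimetype, type) and their values are assumptions for illustration and may not match what Elsevier actually returns.

```r
library(jsonlite)

# Hypothetical response body; the real field names and values may differ.
json_txt <- '{
  "attachments": [
    {"ref": "gr1", "mimetype": "image/jpeg", "type": "IMAGE-HIGH-RES"},
    {"ref": "gr2", "mimetype": "image/gif",  "type": "IMAGE-THUMBNAIL"},
    {"ref": "si1", "mimetype": "image/gif",  "type": "ALTIMG"}
  ]
}'

# An array of JSON objects is simplified into a data frame
df <- fromJSON(json_txt, flatten = TRUE)$attachments

# Count objects per MIME type
mime_counts <- table(df$mimetype)

# Flag rows whose ref or type mentions "table" (case-insensitive)
looks_like_table <- grepl("table", paste(df$ref, df$type), ignore.case = TRUE)
```

If looks_like_table is FALSE for every row, the listing contains only media objects, which matches the observation that tables are not exposed this way.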
That's true. These are media objects containing figures, images, formulas, etc. I will raise this with the Elsevier developers. If they add this feature to their API, I will get in touch with you about integrating it into the rscopus package. Thanks for your support so far!
Which XML tables are you referring to?
If you have a question, please provide an MCVE: https://stackoverflow.com/help/mcve. I recommend building the reproducible example with the reprex package (https://github.com/tidyverse/reprex). Also, please include the output of sessioninfo::session_info().