guido-s / meta

Official Git repository of R package meta
http://cran.r-project.org/web/packages/meta/index.html
GNU General Public License v2.0

fixing read.rm5 #27

Closed FBartos closed 3 years ago

FBartos commented 3 years ago

Hi,

I wanted to read some Cochrane data using the read.rm5 function, but it didn't seem to work. It looks like there have been a few changes in the file structure. In my use case, these changes to the function make it work again.

Cheers, Frantisek

guido-s commented 3 years ago

Hi Frantisek,

Thank you for your suggestions. I created my version of read.rm5() to import about 8000 XML files with Cochrane reviews.

I just re-ran the import using your version and got an error: Malformed XML file (tag: DICH_DATA)

Your version does not work with RevMan files that have a footnote for the outcome data:

<DICH_DATA CI_END="6.679724706062527" CI_START="0.8798193922816696" EFFECT_SIZE="2.4242424242424243" ESTIMABLE="YES" EVENTS_1="8" EVENTS_2="3" LOG_CI_END="0.8247585641070398" LOG_CI_START="-0.05560646987892757" LOG_EFFECT_SIZE="0.38457604711405613" MODIFIED="2012-04-11 19:34:37 +0100" MODIFIED_BY="Xavier L Griffin" ORDER="5" O_E="0.0" SE="0.5171307788405584" STUDY_ID="STD-Dallari-2007" TOTAL_1="11" TOTAL_2="10" VAR="0.26742424242424245" WEIGHT="0.0">
<FOOTNOTE>Data from Analysis 1.2.3 assuming non-union for randomised but unreported participants</FOOTNOTE>
</DICH_DATA>

The problem is that the <DICH_DATA line does not end with the closing /> that regular lines have, e.g.,

<DICH_DATA CI_END="0.0" CI_START="0.0" EFFECT_SIZE="0.0" ESTIMABLE="NO" EVENTS_1="0" EVENTS_2="0" LOG_CI_END="0.0" LOG_CI_START="0.0" LOG_EFFECT_SIZE="0.0" MODIFIED="2012-01-18 11:17:33 +0000" MODIFIED_BY="Xavier L Griffin" ORDER="16" O_E="0.0" SE="0.0" STUDY_ID="STD-Dallari-2007" TOTAL_1="9" TOTAL_2="9" VAR="0.0" WEIGHT="0.0"/>
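
To illustrate how such a count mismatch can arise, here is a minimal sketch. The vector `txt`, its shortened attribute values, and the names `open_idx` and `self_close_idx` are assumptions made up for illustration; they are not taken from read.rm5().

```r
# Hypothetical, shortened lines mimicking the two forms shown above
txt <- c(
  '<DICH_DATA EVENTS_1="8" EVENTS_2="3" STUDY_ID="STD-A">',
  '<FOOTNOTE>Some footnote text</FOOTNOTE>',
  '</DICH_DATA>',
  '<DICH_DATA EVENTS_1="0" EVENTS_2="0" STUDY_ID="STD-B"/>'
)

# Lines that open a DICH_DATA element (both forms)
open_idx <- grep("<DICH_DATA", txt, fixed = TRUE)

# Lines that end an element via the self-closing "/>" form only
self_close_idx <- grep("/>", txt, fixed = TRUE)

# The counts differ, so a check comparing them would report a malformed file
length(open_idx) == length(self_close_idx)
```

The element with a footnote is opened on one line and closed by a separate </DICH_DATA> line, so searching for /> alone misses it.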

I could have a closer look if you send me your RM5-file in a personal email.

Best wishes, Guido

FBartos commented 3 years ago

Hi Guido, thanks for the reply. It seems like I didn't come across data with comments. I added a small change that should take care of it as well:

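          # Lines that open an outcome data element, e.g. <DICH_DATA
          sel.data1 <- grep(paste0("<", outcome.type, "_DATA"), txt.jk)
          # With footnotes present, elements are closed by a separate
          # </..._DATA> line; otherwise they end with "/>"
          if (any(grepl("<FOOTNOTE>", txt.jk)))
            sel.data2 <- grep(paste0("</", outcome.type, "_DATA"), txt.jk)
          else
            sel.data2 <- grep("/>", txt.jk)
          # Every opening line needs exactly one matching end
          if (length(sel.data1) != length(sel.data2))
            stop("Malformed XML file (tag: ", outcome.type, "_DATA)")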

Best, Frantisek
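
The snippet above picks one closing pattern for the whole block, so a block that mixes footnoted and self-closing data lines might still fail the length check. A possible way to handle both forms together is sketched below. `find_data_spans` is a hypothetical helper, not part of meta, and it assumes that each tag starts on its own line, that data elements are not nested, and that closing tags appear in the same order as their opening tags.

```r
# Hypothetical helper (not part of meta): locate the first and last line of
# each <TYPE_DATA> element, whether self-closing or closed by a separate tag.
find_data_spans <- function(txt, outcome.type) {
  open_tag  <- paste0("<", outcome.type, "_DATA")
  close_tag <- paste0("</", outcome.type, "_DATA>")

  # Opening lines of both forms
  start <- grep(open_tag, txt, fixed = TRUE)
  # Opening lines that also end the element ("/>" at the end of the line)
  self_closing <- grepl("/>\\s*$", txt[start])
  # Separate closing lines, needed for elements with child tags
  close_lines <- grep(close_tag, txt, fixed = TRUE)

  if (sum(!self_closing) != length(close_lines))
    stop("Malformed XML file (tag: ", outcome.type, "_DATA)")

  # Self-closing elements end where they start; the rest end at their close tag
  end <- start
  end[!self_closing] <- close_lines
  data.frame(start = start, end = end)
}

# Usage with made-up, shortened lines
txt <- c(
  '<DICH_DATA EVENTS_1="8" EVENTS_2="3" STUDY_ID="STD-A">',
  '<FOOTNOTE>Some footnote text</FOOTNOTE>',
  '</DICH_DATA>',
  '<DICH_DATA EVENTS_1="0" EVENTS_2="0" STUDY_ID="STD-B"/>'
)
spans <- find_data_spans(txt, "DICH")
spans
```

Deciding per opening line, rather than per block, keeps the malformed-file check while allowing both forms to appear together.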