fsingletonthorn EffectSizeScraping issues

MIT License

1 star, 0 forks

issues

Definition of time variable extracted as t-test result

#67 Lucy928 opened 4 years ago
0
PMC ID: 4404462 errors

#66 fsingletonthorn closed 4 years ago
0
extract aberrantly reported tests such as extractTTests("t29 = 2.202")

#65 fsingletonthorn closed 4 years ago
1
Detect chi squares reported as χ(2)2=22.58,p < 0.001

#64 fsingletonthorn closed 4 years ago
0
P value regex can sometimes pull out p values > 1

#63 fsingletonthorn closed 4 years ago
1
Refactor regexes for all test statistics and effect sizes with known acceptable formats

#62 fsingletonthorn closed 4 years ago
2
Correlation coefficients reported in words sometimes picked up with value = "0"

#61 fsingletonthorn closed 4 years ago
1
Partial eta squares are picked up with "p = 2" all the time.

#60 fsingletonthorn closed 4 years ago
0
Errors when scraping validation set

#59 fsingletonthorn closed 4 years ago
3
Create "multiple non significant results" detector

#58 fsingletonthorn opened 4 years ago
0
Write tests for extracting editors info

#57 fsingletonthorn closed 4 years ago
0
Accept semicolons and commas in statistical tests (e.g., F1,2 = 0,123; p = 0,84)

#56 fsingletonthorn closed 5 years ago
1
Pilot on 200 articles included in the extraction test section

#55 fsingletonthorn closed 4 years ago
1
Make sure that the "F = 4.96, df = 22, p = 0.001" way of reporting all statistical tests will be accepted

#54 fsingletonthorn closed 5 years ago
2
Accept ":" instead of "=" in search and effect size regexes

#53 fsingletonthorn closed 5 years ago
0
Accept X as a chi in chi square test - folks seem to be doing this

#52 fsingletonthorn closed 5 years ago
0
Convert ρ to r in read out

#51 fsingletonthorn closed 5 years ago
0
Do reliability check on statistical test results

#50 fsingletonthorn closed 4 years ago
2
Remove "measures" and "screened"

#49 fsingletonthorn closed 5 years ago
1
Issues with sample size extraction

#48 fsingletonthorn closed 5 years ago
0
Statcheck output is not tidy

#47 fsingletonthorn closed 5 years ago
0
Run statcheck on all files - currently does not pass tests

#46 fsingletonthorn closed 5 years ago
0
Newlines sometimes included in keywords

#45 fsingletonthorn closed 5 years ago
1
"R > 1" is captured as a correlation

#44 fsingletonthorn closed 5 years ago
1
Breaks when PDF is not machine readable - related to #18

#43 fsingletonthorn closed 5 years ago
1
Ensure that failed PDF downloads throw an error / the error is recorded and available somehow

#42 fsingletonthorn opened 5 years ago
0
sample size detector doesn't pick up when numbers are broken by commas e.g., "30,000"

#41 fsingletonthorn closed 5 years ago
0
To do - go through and test on 25 manually selected articles before the formal test set is developed

#40 fsingletonthorn closed 5 years ago
3
To do - go through and test on 10 manually selected articles before the formal test set is developed

#39 fsingletonthorn closed 5 years ago
1
Extract CI values along with words that triggered CI extractor

#38 fsingletonthorn opened 5 years ago
0
rewrite processPMC so that it doesn't have to copy and output the data but rather just return the test statistics

#37 fsingletonthorn closed 5 years ago
0
Add in the search functions to scrapePMC

#36 fsingletonthorn closed 4 years ago
1
Add "took part" to participant extraction list

#35 fsingletonthorn closed 5 years ago
0
Maybe estimate number of studies in each paper?

#34 fsingletonthorn opened 5 years ago
0
Chi square output can fail to extract dfs e.g., "χdf2=4=4.541"

#33 fsingletonthorn closed 5 years ago
0
Reformat $text output to tidy

#32 fsingletonthorn closed 5 years ago
0
Maybe also build in a reliability check detector?

#31 fsingletonthorn opened 5 years ago
0
Probably also extract p values on their own?

#30 fsingletonthorn closed 5 years ago
0
Incorrect splitting when hundreds are used as multipliers for higher order magnitudes e.g., "One hundred and two thousand"

#29 fsingletonthorn closed 5 years ago
2
Word_to_numbers sometimes leads to words being replaced with 0

#28 fsingletonthorn closed 5 years ago
1
Write function to replace numbers with words during full text cleaning

#27 fsingletonthorn closed 5 years ago
2
Replace homespun text cleaning functions with replace_non_ascii() from the textclean package for consistency?

#26 fsingletonthorn closed 5 years ago
2
Include P value detector and extractor

#25 fsingletonthorn opened 5 years ago
1
Scanned PDFs often read 1 as a lower case “L”, breaking my read function

#24 fsingletonthorn closed 5 years ago
1
html tag removal deletes portions of text contained within < > even when not a html tag

#23 fsingletonthorn closed 5 years ago
1
chi square extraction can accidentally extract the df and not the test stat

#22 fsingletonthorn closed 5 years ago
0
Article PDFs with keywords / etc beside the abstract can lead to the abstract being placed within the introduction text

#21 fsingletonthorn opened 5 years ago
0
Look for PDF when XML doesn't get returned from PubMed

#20 fsingletonthorn closed 5 years ago
3
Sample size extraction

#19 fsingletonthorn closed 5 years ago
1
build in OCR using the Tesseract OCR engine when PDF extraction fails to extract any text

#18 fsingletonthorn opened 5 years ago
1