fsingletonthorn/EffectSizeScraping
MIT License
1 star, 0 forks
issues
Definition of time variable extracted as t-test result
#67
Lucy928
opened
4 years ago
0
PMC ID: 4404462 errors
#66
fsingletonthorn
closed
4 years ago
0
extract aberrantly reported tests such as extractTTests("t29 = 2.202")
#65
fsingletonthorn
closed
4 years ago
1
Detect chi squares reported as χ(2)2=22.58,p < 0.001
#64
fsingletonthorn
closed
4 years ago
0
P value regex can sometimes pull out p values > 1
#63
fsingletonthorn
closed
4 years ago
1
Refactor regexes for all test statistics and effect sizes with known acceptable formats
#62
fsingletonthorn
closed
4 years ago
2
Correlation coefficients reported in words sometimes picked up with value = "0"
#61
fsingletonthorn
closed
4 years ago
1
Partial eta squares are picked up with "p = 2" all the time.
#60
fsingletonthorn
closed
4 years ago
0
Errors when scraping validation set
#59
fsingletonthorn
closed
4 years ago
3
Create "multiple non significant results" detector
#58
fsingletonthorn
opened
4 years ago
0
Write tests for extracting editors info
#57
fsingletonthorn
closed
4 years ago
0
Accept semicolons and commas in statistical tests (e.g., F1,2 = 0,123; p = 0,84)
#56
fsingletonthorn
closed
5 years ago
1
Pilot on 200 articles included in the extraction test section
#55
fsingletonthorn
closed
4 years ago
1
Make sure that the "F = 4.96, df = 22, p = 0.001" way of reporting statistical tests is accepted
#54
fsingletonthorn
closed
5 years ago
2
Accept ":" instead of "=" in search and effect size regexes
#53
fsingletonthorn
closed
5 years ago
0
Accept X as a chi in chi square test - folks seem to be doing this
#52
fsingletonthorn
closed
5 years ago
0
Convert ρ to r in read out
#51
fsingletonthorn
closed
5 years ago
0
Do reliability check on statistical test results
#50
fsingletonthorn
closed
4 years ago
2
Remove "measures" and "screened"
#49
fsingletonthorn
closed
5 years ago
1
Issues with sample size extraction
#48
fsingletonthorn
closed
5 years ago
0
Statcheck output is not tidy
#47
fsingletonthorn
closed
5 years ago
0
Run statcheck on all files - currently does not pass tests
#46
fsingletonthorn
closed
5 years ago
0
Newlines sometimes included in keywords
#45
fsingletonthorn
closed
5 years ago
1
"R > 1" is captured as a correlation
#44
fsingletonthorn
closed
5 years ago
1
Breaks when PDF is not machine readable - related to #18
#43
fsingletonthorn
closed
5 years ago
1
Ensure that failed PDF downloads throw an error / the error is recorded and available somehow
#42
fsingletonthorn
opened
5 years ago
0
sample size detector doesn't pick up when numbers are broken by commas e.g., "30,000"
#41
fsingletonthorn
closed
5 years ago
0
To do - go through and test on 25 manually selected articles before the formal test set is developed
#40
fsingletonthorn
closed
5 years ago
3
To do - go through and test on 10 manually selected articles before the formal test set is developed
#39
fsingletonthorn
closed
5 years ago
1
Extract CI values along with words that triggered CI extractor
#38
fsingletonthorn
opened
5 years ago
0
rewrite processPMC so that it doesn't have to copy and output the data but rather just return the test statistics
#37
fsingletonthorn
closed
5 years ago
0
Add in the search functions to scrapePMC
#36
fsingletonthorn
closed
4 years ago
1
Add "took part" to participant extraction list
#35
fsingletonthorn
closed
5 years ago
0
Maybe estimate number of studies in each paper?
#34
fsingletonthorn
opened
5 years ago
0
Chi square output can fail to extract dfs e.g., "χdf2=4=4.541"
#33
fsingletonthorn
closed
5 years ago
0
Reformat $text output to tidy
#32
fsingletonthorn
closed
5 years ago
0
Maybe also build in a reliability check detector?
#31
fsingletonthorn
opened
5 years ago
0
Probably also extract p values on their own?
#30
fsingletonthorn
closed
5 years ago
0
Incorrect splitting when hundreds are used as multipliers for higher order magnitudes e.g., "One hundred and two thousand"
#29
fsingletonthorn
closed
5 years ago
2
Word_to_numbers sometimes leads to words being replaced with 0
#28
fsingletonthorn
closed
5 years ago
1
Write function to replace numbers with words during full text cleaning
#27
fsingletonthorn
closed
5 years ago
2
Replace homespun text cleaning functions with replace_non_ascii() from the textclean package for consistency?
#26
fsingletonthorn
closed
5 years ago
2
Include P value detector and extractor
#25
fsingletonthorn
opened
5 years ago
1
Scanned PDFs often read 1 as a lower case “L”, breaking my read function
#24
fsingletonthorn
closed
5 years ago
1
HTML tag removal deletes portions of text contained within < > even when not an HTML tag
#23
fsingletonthorn
closed
5 years ago
1
chi square extraction can accidentally extract the df and not the test stat
#22
fsingletonthorn
closed
5 years ago
0
Article PDFs with keywords / etc beside the abstract can lead to the abstract being placed within the introduction text
#21
fsingletonthorn
opened
5 years ago
0
Look for PDF when XML doesn't get returned from PubMed
#20
fsingletonthorn
closed
5 years ago
3
Sample size extraction
#19
fsingletonthorn
closed
5 years ago
1
build in OCR using the Tesseract OCR engine when PDF extraction fails to extract any text
#18
fsingletonthorn
opened
5 years ago
1