courtsbr / esaj

Scrapers for many e-SAJ systems
http://courtsbr.github.io/esaj/
GNU General Public License v2.0

Low success rate when using download_decision() #25

Open lagolucas opened 6 years ago

lagolucas commented 6 years ago

I ran this with only 200 decisions, but in my original code I end up using about 3000.

The problem is that the download success rate has dropped a lot since the last time I used it. Before, I would guess the rate was above 50%; now I believe it is below 1%.

The code I used for testing is below. The result was 4 decisions downloaded out of 200. Any suggestion for changing the code is welcome:

teste <- c("11617159" ,"11616826", "11614197", "11613204", "11612465", "11612382",
"11610790", "11596184", "11585837", "11579077", "11579466", "11577358", "11572937",
"11573007", "11574174", "11568555", "11565531", "11564407", "11556810", "11549104",
"11545315", "11537176", "11537950", "11537951", "11526058", "11516110", "11511079", 
"11513449", "11508610", "11508568", "11503389", "11498955", "11489817", "11485411",
"11479724", "11480155", "11460948", "11462275", "11448583","11445232", "11441593",
"11440951", "11438440", "11436243", "11423825", "11421460", "11416047", "11409629",
"11403745", "11403744", "11406635", "11403663", "11403377", "11403630", "11398831",
"11393143", "11385625", "11381633", "11379419", "11378051", "11369112", "11365960",
"11367136", "11365345", "11365327", "11359719", "11360049", "11357297", "11353364",
"11353520", "11354949", "11343282", "11340297", "11340155", "11334865", "11329995", 
"11327349", "11323885", "11323884", "11317873", "11317179", "11315337", "11315241", 
"11315030", "11311922", "11310228", "11309236", "11308380", "11297992", "11299057", 
"11297076", "11296682", "11295834", "11295887", "11293266", "11291271", "11290014", 
"11288893", "11286811", "11281715", "11275175", "11273259", "11266539", "11263203", 
"11261741", "11260513", "11257231", "11251051", "11243957", "11242693", "11226301", 
"11227568", "11222617", "11218730", "11214487", "11209927", "11208034", "11203952", 
"11196930", "11200642", "11189730", "11194175", "11191469", "11191379", "11186632", 
"11183063", "11179295", "11182386", "11180725", "11180854", "11179799", "11180913", 
"11172762", "11165020", "11162177", "11158851", "11157543", "11163216", "11155797", 
"11151707", "11150154", "11133850", "11131344", "11129935", "11125810", "11120858", 
"11113894", "11110115", "11107224", "11101203", "11098796", "11087477", "11087782", 
"11091022", "11094604", "11094684", "11085162", "11076721", "11076080", "11078846", 
"11080229", "11073569", "11065171", "11065617", "11062180", "11058370", "11060600", 
"11058198", "11058378", "11053116", "11050031", "11045719", "11038303", "11040633", 
"11041094", "11040996", "11038209", "11033692", "11033691", "11034738", "11033245", 
"11028949", "11031748", "11030768", "11028519", "11031738", "11019675", "11022262", 
"11025347", "11014711", "11012362", "11014089", "11016115", "11013972", "11006254", 
"11006467", "11007225", "10996923", "10998816", "10999542")
library(esaj)
esaj::download_decision(teste, "./decisoes/")
#>   [1] ""                                     
#>   [2] "/tmp/Rtmp4es8XU/decisoes/11616826.pdf"
#>   [3] "/tmp/Rtmp4es8XU/decisoes/11614197.pdf"
#>   [4] "/tmp/Rtmp4es8XU/decisoes/11613204.pdf"
#>   [5] "/tmp/Rtmp4es8XU/decisoes/11612465.pdf"
#>   [6] ""                                     
#>   [7] ""                                     
#>   [8] ""                                     
#>   [9] ""                                     
#>  [10] ""                                     
#>  [11] ""                                     
#>  [12] ""                                     
#>  [13] ""                                     
#>  [14] ""                                     
#>  [15] ""                                     
#>  [16] ""                                     
#>  [17] ""                                     
#>  [18] ""                                     
#>  [19] ""                                     
#>  [20] ""                                     
#>  [21] ""                                     
#>  [22] ""                                     
#>  [23] ""                                     
#>  [24] ""                                     
#>  [25] ""                                     
#>  [26] ""                                     
#>  [27] ""                                     
#>  [28] ""                                     
#>  [29] ""                                     
#>  [30] ""                                     
#>  [31] ""                                     
#>  [32] ""                                     
#>  [33] ""                                     
#>  [34] ""                                     
#>  [35] ""                                     
#>  [36] ""                                     
#>  [37] ""                                     
#>  [38] ""                                     
#>  [39] ""                                     
#>  [40] ""                                     
#>  [41] ""                                     
#>  [42] ""                                     
#>  [43] ""                                     
#>  [44] ""                                     
#>  [45] ""                                     
#>  [46] ""                                     
#>  [47] ""                                     
#>  [48] ""                                     
#>  [49] ""                                     
#>  [50] ""                                     
#>  [51] ""                                     
#>  [52] ""                                     
#>  [53] ""                                     
#>  [54] ""                                     
#>  [55] ""                                     
#>  [56] ""                                     
#>  [57] ""                                     
#>  [58] ""                                     
#>  [59] ""                                     
#>  [60] ""                                     
#>  [61] ""                                     
#>  [62] ""                                     
#>  [63] ""                                     
#>  [64] ""                                     
#>  [65] ""                                     
#>  [66] ""                                     
#>  [67] ""                                     
#>  [68] ""                                     
#>  [69] ""                                     
#>  [70] ""                                     
#>  [71] ""                                     
#>  [72] ""                                     
#>  [73] ""                                     
#>  [74] ""                                     
#>  [75] ""                                     
#>  [76] ""                                     
#>  [77] ""                                     
#>  [78] ""                                     
#>  [79] ""                                     
#>  [80] ""                                     
#>  [81] ""                                     
#>  [82] ""                                     
#>  [83] ""                                     
#>  [84] ""                                     
#>  [85] ""                                     
#>  [86] ""                                     
#>  [87] ""                                     
#>  [88] ""                                     
#>  [89] ""                                     
#>  [90] ""                                     
#>  [91] ""                                     
#>  [92] ""                                     
#>  [93] ""                                     
#>  [94] ""                                     
#>  [95] ""                                     
#>  [96] ""                                     
#>  [97] ""                                     
#>  [98] ""                                     
#>  [99] ""                                     
#> [100] ""                                     
#> [101] ""                                     
#> [102] ""                                     
#> [103] ""                                     
#> [104] ""                                     
#> [105] ""                                     
#> [106] ""                                     
#> [107] ""                                     
#> [108] ""                                     
#> [109] ""                                     
#> [110] ""                                     
#> [111] ""                                     
#> [112] ""                                     
#> [113] ""                                     
#> [114] ""                                     
#> [115] ""                                     
#> [116] ""                                     
#> [117] ""                                     
#> [118] ""                                     
#> [119] ""                                     
#> [120] ""                                     
#> [121] ""                                     
#> [122] ""                                     
#> [123] ""                                     
#> [124] ""                                     
#> [125] ""                                     
#> [126] ""                                     
#> [127] ""                                     
#> [128] ""                                     
#> [129] ""                                     
#> [130] ""                                     
#> [131] ""                                     
#> [132] ""                                     
#> [133] ""                                     
#> [134] ""                                     
#> [135] ""                                     
#> [136] ""                                     
#> [137] ""                                     
#> [138] ""                                     
#> [139] ""                                     
#> [140] ""                                     
#> [141] ""                                     
#> [142] ""                                     
#> [143] ""                                     
#> [144] ""                                     
#> [145] ""                                     
#> [146] ""                                     
#> [147] ""                                     
#> [148] ""                                     
#> [149] ""                                     
#> [150] ""                                     
#> [151] ""                                     
#> [152] ""                                     
#> [153] ""                                     
#> [154] ""                                     
#> [155] ""                                     
#> [156] ""                                     
#> [157] ""                                     
#> [158] ""                                     
#> [159] ""                                     
#> [160] ""                                     
#> [161] ""                                     
#> [162] ""                                     
#> [163] ""                                     
#> [164] ""                                     
#> [165] ""                                     
#> [166] ""                                     
#> [167] ""                                     
#> [168] ""                                     
#> [169] ""                                     
#> [170] ""                                     
#> [171] ""                                     
#> [172] ""                                     
#> [173] ""                                     
#> [174] ""                                     
#> [175] ""                                     
#> [176] ""                                     
#> [177] ""                                     
#> [178] ""                                     
#> [179] ""                                     
#> [180] ""                                     
#> [181] ""                                     
#> [182] ""                                     
#> [183] ""                                     
#> [184] ""                                     
#> [185] ""                                     
#> [186] ""                                     
#> [187] ""                                     
#> [188] ""                                     
#> [189] ""                                     
#> [190] ""                                     
#> [191] ""                                     
#> [192] ""                                     
#> [193] ""                                     
#> [194] ""                                     
#> [195] ""                                     
#> [196] ""                                     
#> [197] ""                                     
#> [198] ""                                     
#> [199] ""                                     
#> [200] ""

Created on 2018-07-17 by the reprex package (v0.2.0).

Session info

``` r
devtools::session_info()
#> Session info -------------------------------------------------------------
#>  setting  value
#>  version  R version 3.4.4 (2018-03-15)
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  C
#>  tz       America/Sao_Paulo
#>  date     2018-07-17
#> Packages -----------------------------------------------------------------
#>  package     * version    date       source
#>  assertthat    0.2.0      2017-04-11 cran (@0.2.0)
#>  backports     1.1.2      2017-12-13 CRAN (R 3.4.4)
#>  base        * 3.4.4      2018-04-21 local
#>  base64enc     0.1-3      2015-07-28 cran (@0.1-3)
#>  bindr         0.1.1      2018-03-13 CRAN (R 3.4.4)
#>  bindrcpp    * 0.2.2      2018-03-29 CRAN (R 3.4.4)
#>  compiler      3.4.4      2018-04-21 local
#>  crayon        1.3.4      2017-09-16 cran (@1.3.4)
#>  curl          3.2        2018-03-28 CRAN (R 3.4.4)
#>  datasets    * 3.4.4      2018-04-21 local
#>  devtools      1.13.6     2018-06-27 CRAN (R 3.4.4)
#>  digest        0.6.15     2018-01-28 CRAN (R 3.4.2)
#>  dplyr         0.7.6      2018-06-29 CRAN (R 3.4.4)
#>  esaj        * 0.1.2.9000 2018-07-16 Github (courtsbr/esaj@2fc11fe)
#>  evaluate      0.10.1     2017-06-24 cran (@0.10.1)
#>  glue          1.2.0      2017-10-29 cran (@1.2.0)
#>  graphics    * 3.4.4      2018-04-21 local
#>  grDevices   * 3.4.4      2018-04-21 local
#>  hms           0.4.2      2018-03-10 CRAN (R 3.4.4)
#>  htmltools     0.3.6      2017-04-28 CRAN (R 3.4.4)
#>  httr          1.3.1      2017-08-20 CRAN (R 3.4.2)
#>  jsonlite      1.5        2017-06-01 CRAN (R 3.4.2)
#>  knitr         1.20       2018-02-20 CRAN (R 3.4.4)
#>  lubridate     1.7.4      2018-04-11 CRAN (R 3.4.4)
#>  magick        1.9        2018-05-11 CRAN (R 3.4.4)
#>  magrittr      1.5        2014-11-22 cran (@1.5)
#>  memoise       1.1.0      2017-04-21 CRAN (R 3.4.2)
#>  methods     * 3.4.4      2018-04-21 local
#>  pillar        1.3.0      2018-07-14 CRAN (R 3.4.4)
#>  pkgconfig     2.0.1      2017-03-21 cran (@2.0.1)
#>  png           0.1-7      2013-12-03 cran (@0.1-7)
#>  prettyunits   1.0.2      2015-07-13 cran (@1.0.2)
#>  progress      1.2.0      2018-06-14 CRAN (R 3.4.4)
#>  purrr         0.2.5      2018-05-29 CRAN (R 3.4.4)
#>  R6            2.2.2      2017-06-17 CRAN (R 3.4.2)
#>  rappdirs      0.3.1      2016-03-28 cran (@0.3.1)
#>  Rcpp          0.12.17    2018-05-18 CRAN (R 3.4.4)
#>  rlang         0.2.1      2018-05-30 CRAN (R 3.4.4)
#>  rmarkdown     1.10       2018-06-11 CRAN (R 3.4.4)
#>  rprojroot     1.3-2      2018-01-03 CRAN (R 3.4.4)
#>  stats       * 3.4.4      2018-04-21 local
#>  stringi       1.2.3      2018-06-12 CRAN (R 3.4.4)
#>  stringr       1.3.1      2018-05-10 CRAN (R 3.4.4)
#>  tesseract     2.2        2018-07-10 CRAN (R 3.4.4)
#>  tibble        1.4.2      2018-01-22 cran (@1.4.2)
#>  tidyr         0.8.1      2018-05-18 CRAN (R 3.4.4)
#>  tidyselect    0.2.4      2018-02-26 CRAN (R 3.4.4)
#>  tools         3.4.4      2018-04-21 local
#>  utils       * 3.4.4      2018-04-21 local
#>  withr         2.1.2      2018-03-15 CRAN (R 3.4.4)
#>  yaml          2.1.19     2018-05-01 CRAN (R 3.4.4)
```

lagolucas commented 6 years ago

Changing the code above to:

for (decisao in teste) {
  Sys.sleep(3)
  esaj:::download_decision(decision = decisao, path = "Desktop/", ntry = 1)
}

The success rate increased considerably, which suggests to me that the repeated attempts were somehow being blocked.
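
For anyone who wants to measure this rather than eyeball it, here is a rough sketch of a throttled loop that keeps the result for each ID. It is not part of esaj: `download_throttled`, `success_rate`, and the `downloader` argument are hypothetical names made up for illustration. It assumes that `esaj::download_decision(id, path)` returns the file path on success and `""` on failure, as the output in the first comment suggests.

```r
# Hypothetical helper (not part of esaj): download one ID at a time,
# pausing between requests, and return one path per ID ("" on failure).
# `downloader` defaults to esaj::download_decision; the default is only
# evaluated when no other function is supplied.
download_throttled <- function(ids, path, pause = 3,
                               downloader = esaj::download_decision) {
  vapply(ids, function(id) {
    Sys.sleep(pause)
    # Treat an error the same way as a failed download
    out <- tryCatch(downloader(id, path), error = function(e) "")
    # Normalize anything unexpected (NULL, NA, longer vector) to ""
    if (length(out) != 1 || is.na(out)) "" else as.character(out)
  }, character(1))
}

# Share of IDs that produced a file
success_rate <- function(files) mean(files != "")
```

Something like `res <- download_throttled(teste, "./decisoes/")` followed by `success_rate(res)` would then give a number to compare across runs, and `names(res)[res == ""]` would list the IDs worth retrying later.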

mpena2099 commented 5 years ago

@lagolucas, for me not even your last example worked. I can't download ANY of the 200 IDs you listed.

Does this code still work for you today? Oh, and I had to change "esaj:::download_decision" to "esaj:::downloaddecision" to be able to pass "ntry = 1" as a parameter.