Leszek-Sieminski / screamingFrogR

R integration with Screaming Frog CLI

--export-tabs inserts unwanted commas for arguments containing spaces #3

Open mxcrml opened 2 years ago

mxcrml commented 2 years ago

Hi,

I'm having trouble launching sfr_crawl when the export tabs arguments contain a space, such as "Response Codes", "Page Titles", etc. The space seems to be treated as an argument break (lines 3-4), which then turns into a comma (line 14), so Screaming Frog cannot parse the string (line 17).

1- 2022-07-31 18:08:58,130 [15212] [main] INFO  - argument 7 = '--save-crawl'
2- 2022-07-31 18:08:58,130 [15212] [main] INFO  - argument 8 = '--export-tabs'
3- 2022-07-31 18:08:58,130 [15212] [main] INFO  - argument 9 = 'Internal:All,External:All,Security:All,Response'
4- 2022-07-31 18:08:58,130 [15212] [main] INFO  - argument 10 = 'Codes:All'
5- 2022-07-31 18:08:58,130 [15212] [main] INFO  - argument 11 = '--headless'
6- 2022-07-31 18:08:58,130 [15212] [main] INFO  - argument 12 = '--config'
7- 2022-07-31 18:08:58,131 [15212] [main] INFO  - argument 13 = 'C:/Users/xxx/Desktop/ScreamingFrog-Automation/seoconfig.seospiderconfig'
8- 2022-07-31 18:08:58,133 [15212] [main] INFO  - Parsed arguments:
9- 2022-07-31 18:08:58,140 [15212] [main] INFO  - --crawl https://www.example.com
10- 2022-07-31 18:08:58,140 [15212] [main] INFO  - --output-folder C:/Users/xxx/Desktop/ScreamingFrog-Automation/
11- 2022-07-31 18:08:58,141 [15212] [main] INFO  - --timestamped-output 
12- 2022-07-31 18:08:58,141 [15212] [main] INFO  - --export-format csv
13- 2022-07-31 18:08:58,141 [15212] [main] INFO  - --save-crawl 
14- 2022-07-31 18:08:58,142 [15212] [main] INFO  - --export-tabs Internal:All,External:All,Security:All,Response,Codes:All
15- 2022-07-31 18:08:58,142 [15212] [main] INFO  - --headless 
16- 2022-07-31 18:08:58,142 [15212] [main] INFO  - --config C:/Users/xxx/Desktop/ScreamingFrog-Automation/seoconfig.seospiderconfig
17- 2022-07-31 18:08:58,143 [15212] [main] FATAL - --export-tabs arguments must be in the format Tab:Filter

Do you have any idea how to fix this? I think we would need to replace the space with some kind of placeholder string, or otherwise protect it, but I really have no clue.
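To illustrate the suspected cause, here is a minimal sketch in base R. It assumes (this is not confirmed by the package source) that the command line is assembled as a single string and then split on whitespace by the shell. The names `tabs_arg`, `unquoted`, `quoted`, and `split_args` are hypothetical, and `scan()` only approximates shell tokenization.

```r
# Hypothetical reconstruction of how the --export-tabs value might be built.
tabs <- c("Internal:All", "External:All", "Security:All", "Response Codes:All")
tabs_arg <- paste(tabs, collapse = ",")

# Unquoted: the space inside "Response Codes" becomes an argument boundary.
unquoted <- paste("--export-tabs", tabs_arg)

# Quoted: shQuote() wraps the value so the space stays inside one argument.
# type = "cmd" assumes the Windows command interpreter.
quoted <- paste("--export-tabs", shQuote(tabs_arg, type = "cmd"))

# scan() splits on whitespace but honours quotes, which roughly mimics
# how a shell turns a command string into separate arguments.
split_args <- function(cmd) scan(text = cmd, what = "", quiet = TRUE)

print(split_args(unquoted))  # three pieces: the value is split at the space
print(split_args(quoted))    # two pieces: the value survives intact
```

If this assumption holds, quoting the value is likely a simpler fix than substituting the space with another string, since Screaming Frog expects the tab name with its space.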

Thanks.

Leszek-Sieminski commented 2 years ago

Hi Maxime!

It looks like a bug with name parsing indeed but can you provide me with:

so it's easier to debug? Thanks in advance!

Best Leszek

mxcrml commented 2 years ago

Hi Leszek! Thank you for the quick response :)

Here is all the info you asked for:

Screaming Frog SEO Spider: Version 16.7
ScreamingFrogR: Version 0.1.1
R: version 4.2.0 Patched (2022-05-17 r82376 ucrt)
RStudio: 2022.07.1 Build 554

R code:

#------------------------------------------------------------------------#
#                              INITIALIZATION                            #
#------------------------------------------------------------------------#
# Load Packages (require the 'pacman' package)
# install.packages("pacman")
# devtools::install_github("Leszek-Sieminski/screamingFrogR")
# library("screamingFrogR")

pacman::p_load(xlsx,here,screamingFrogR)
# setup --------------------------------------------------------
screamingFrogR::sfr_setup_windows(path = "C:/Program Files (x86)/Screaming Frog SEO Spider/")
myurl<-"https://www.astralrank.com"
myconfig<-here("seoconfig.seospiderconfig")
mytabs <- c("Internal:All","External:All","Security:All","Response Codes:All")

# running a crawl -------------------------------------------------------------
screamingFrogR::sfr_crawl(
  url = myurl,
  save_crawl_file = TRUE,
  export_tabs = mytabs,
  timestamped_output = TRUE,
  config = myconfig
)

Session info:

R version 4.2.0 Patched (2022-05-17 r82376 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22000)

Matrix products: default

locale:
[1] LC_COLLATE=French_France.utf8  LC_CTYPE=French_France.utf8   
[3] LC_MONETARY=French_France.utf8 LC_NUMERIC=C                  
[5] LC_TIME=French_France.utf8    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] screamingFrogR_0.1.1 here_1.0.1           xlsx_0.6.5          

loaded via a namespace (and not attached):
 [1] compiler_4.2.0    assertthat_0.2.1  rprojroot_2.0.3   tools_4.2.0      
 [5] glue_1.6.2        cellranger_1.1.0  readxl_1.4.0      data.table_1.14.2
 [9] xlsxjars_0.6.1    rJava_1.0-6       pacman_0.5.1  

I hope this helps you at least reproduce the issue, and ideally solve it! Let me know :)

Best, Maxime

mxcrml commented 2 years ago

Hi Leszek! Hope you're doing well.

Any new ideas on how to solve this bug? Let me know if I can help.

Best Maxime

Leszek-Sieminski commented 2 years ago

Hello Maxime!

I'm sorry for the delay. I was seriously sick for almost a month and couldn't work on it, and after that I simply lost track of it. I will plan my time so that I can fix this next week.

Best, Leszek

mxcrml commented 2 years ago

Hi Leszek!

Wow... very sorry to hear that. I wish you a speedy recovery and hope you're doing well (or at least OK) now. No worries at all about the delay, it's all about sharing here (at least for me).

Have a very good day.

Best, Maxime

mxcrml commented 1 year ago

Hi Leszek,

How are you? Hope you're doing well :) By any chance, could you spend some time on this issue? I can help fix it with a little guidance :)

Best, Maxime

Leszek-Sieminski commented 1 year ago

Hi Maxime! If you could guide me, that would be wonderful :) If you want to, you can also fork the repo, fix it, and send me a pull request. If not, I will be thankful for any support.
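As a possible starting point for such a pull request, here is a hedged sketch of a helper that validates and quotes the export tabs before they reach the command line. `build_export_tabs_arg` is a hypothetical name and is not part of screamingFrogR. The Tab:Filter pattern is an assumption based only on the FATAL message in the log above, and `type = "cmd"` assumes the Windows command interpreter.

```r
# Hypothetical helper; not part of screamingFrogR.
build_export_tabs_arg <- function(tabs) {
  # Assumed rule: each entry is "Tab:Filter" with exactly one colon and no
  # commas, matching the format named in the FATAL log message.
  ok <- grepl("^[^:,]+:[^:,]+$", tabs)
  if (!all(ok)) {
    stop("Not in Tab:Filter format: ", paste(tabs[!ok], collapse = ", "))
  }
  # Join with commas, then quote the whole value so that spaces in tab
  # names such as "Response Codes" are not treated as argument breaks.
  shQuote(paste(tabs, collapse = ","), type = "cmd")
}

# Example with a tab name that contains a space.
print(build_export_tabs_arg(c("Internal:All", "Response Codes:All")))
```

Where exactly such a helper would be called inside sfr_crawl depends on how the package builds its command, which this thread does not show.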