arunkannawadi / independent-citation-counter

A collection of Jupyter notebooks that count independent citations from different bibliographic databases

debug independent cited paper give error -99 #2

Open yinshiyi opened 1 year ago

yinshiyi commented 1 year ago

scholar_id = 'kdsUmJkAAAAJ'

https://scholar.google.com/scholar?hl=en&cites=14870815335955536765&hl=en&scipsc=1&q=-author:%27M%20NandyMazumdar%27+-author:%27A%20Paranjapye%27+-author:%27J%20Browne%27+-author:%27S%20Yin%27+-author:%27S%20Leir%27+-author:%27A%20Harris%27

At the time of writing, this link shows 7 entries. Manual curation gives 7/8.

The Colab code is outputting -99/8.

    if links_only:
        return None

    try:
        search_results = scholarly.search_pubs_custom_url(independent_url)
        num_independent_citations = search_results.total_results if search_results.total_results else 0
    except Exception as err:
        num_independent_citations = -99

I would appreciate it a lot if you could help me debug this issue. Thank you.

arunkannawadi commented 1 year ago

To get the numbers, you need to set proxy_type to FreeProxies or one of the other methods mentioned in the notebook's description. Using FreeProxies makes the code run slower, and success is not guaranteed, because Google Scholar actively tries to block programmatic retrieval of those pages. If this is your only query, or you have only a small number of queries, you can set the links_only variable to False and it will retrieve the number. If you plan to use this regularly or for a large number of queries (say, more than 5 papers), you risk being flagged by Google Scholar. In that case, you are better off clicking the links yourself and reading the number off the page.
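For reference, a minimal sketch of what the FreeProxies setup might look like with the scholarly package. The helper name configure_scholarly_proxy is hypothetical and not part of the notebook, only the FreeProxies value of proxy_type is handled here, and the notebook's own proxy logic may differ.

```python
def configure_scholarly_proxy(proxy_type):
    """Route scholarly's requests through a proxy; return True if one was set.

    Hypothetical helper for illustration; not taken from the notebook.
    """
    if proxy_type is None:
        # No proxy: requests go out directly and are more likely to be blocked.
        return False
    if proxy_type != "FreeProxies":
        raise ValueError(f"proxy_type not handled in this sketch: {proxy_type!r}")

    # Imported lazily so the function can be defined without scholarly installed.
    from scholarly import scholarly, ProxyGenerator

    pg = ProxyGenerator()
    ok = pg.FreeProxies()  # may be slow, and may fail to find a working proxy
    if ok:
        scholarly.use_proxy(pg)
    return bool(ok)
```

Calling configure_scholarly_proxy("FreeProxies") once before running the queries should be enough. A False return value would suggest that no working proxy was found.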

yinshiyi commented 1 year ago

Thank you for your comment. I did use FreeProxies; the results I reported were generated with FreeProxies. This repo worked great for some articles, but I discovered a bug: for a paper where 7 independent citations are found, the Colab output is the error value -99. I tried to debug it by looking at the try/except portion of the code, but couldn't figure out why. If you could guide me through the debugging process, that would be great. Thank you.
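One possible starting point for debugging: the except branch in the snippet above discards the exception, so -99 says nothing about what actually failed. A minimal sketch that keeps the same -99 sentinel but also reports the error is below. The helper count_with_diagnostics and its fetch argument are hypothetical. fetch stands in for scholarly.search_pubs_custom_url so that the logic can be tried without network access.

```python
import logging

logger = logging.getLogger("citation_debug")


def count_with_diagnostics(fetch, url, sentinel=-99):
    """Return (count, error_text); error_text is None when the query succeeds.

    `fetch` is a hypothetical stand-in for scholarly.search_pubs_custom_url.
    """
    try:
        results = fetch(url)
        total = getattr(results, "total_results", None)
        # Mirror the original logic: a missing or zero total counts as 0.
        return (total if total else 0), None
    except Exception as err:
        # Log the full traceback instead of silently swallowing the error.
        logger.exception("Query failed for %s", url)
        return sentinel, f"{type(err).__name__}: {err}"
```

In the notebook, this could be called as count_with_diagnostics(scholarly.search_pubs_custom_url, independent_url). The second element of the returned tuple then shows which exception produced the -99, for example a blocked request or a page that could not be parsed.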