Closed genomematt closed 6 years ago
JOSS papers have included the Google Scholar metadata tags since this update about 10 days ago.
Their docs suggest it can take 4-6 weeks to be indexed so I think we're still in that window. Let's leave this open for now to keep tracking this issue. I agree we must be indexed by Google Scholar, I'm just not sure there's anything we need to do other than wait at this point 🕐
Excellent!
At least a couple papers show up now: https://scholar.google.com/scholar?hl=en&q=GeneNetwork%3A+framework+for+web-based+genetics https://scholar.google.com/scholar?hl=en&q=R3D2%3A+Relativistic+Reactive+Riemann+problem+solver+for+Deflagrations+and+Detonations
However other articles on the same page currently do not: https://scholar.google.com/scholar?q=Xenomapper%3A+Mapping+reads+in+a+mixed+species+context https://scholar.google.com/scholar?q=pyuca%3A+a+Python+implementation+of+the+Unicode+Collation+Algorithm
Pretty cool that some are showing anyway and maybe google's getting around to processing the others.
@sherrillmix, thanks for finding these. Let's keep monitoring this.
It looks like Google Scholar is not actually indexing anything from JOSS. The two examples listed by @sherrillmix only show up in Google Scholar because the first is also listed at researchgate.net and the second is listed at eprints.soton.ac.uk. Google Scholar is indexing them based off those domains, not because of anything going on at JOSS.
Maybe Google Scholar isn't crawling through the Github links? My JOSS paper is indexed, but that's from being uploaded to my own website (https://scholar.google.com/citations?view_op=view_citation&citation_for_view=2QJwoAwAAAAJ:3BvdIg-l-ZAC), but I feel like that's still sufficiently different than being on researchgate or a university's own repository.
Maybe it is an idea to move the static pages off github anyway. I know people who do not want to publish with JOSS because of the tight github connection. I think it will be fine to use the issue tracker etc., but at least it won't look like JOSS being a github subsidiary.
Github pages is more integrated now (no longer requires the gh-pages branch). It might be easier to switch to that now, rather than having the paper PDFs being loaded through the Github file preview.
OK, so I actually just submitted JOSS through the Google Scholar page for requesting indexing of a journal. This is the (automated) response I got:
Homepage: http://joss.theoj.org
Contact name: Kyle Niemeyer
Contact email: kyle.niemeyer@gmail.com
Inclusion type: Other journal website
Inclusion size: 51-100
Volume URLs: http://joss.theoj.org/papers/popular
Issue URLs: http://joss.theoj.org/papers/popular
TOC URLs: http://joss.theoj.org/papers/popular
Abstract URLs:
http://joss.theoj.org/papers/10.21105/joss.00194
http://joss.theoj.org/papers/10.21105/joss.00011
http://joss.theoj.org/papers/10.21105/joss.00189
Article URLs:
https://github.com/openjournals/joss-papers/blob/master/joss.00194/10.21105.joss.00194.pdf
https://github.com/openjournals/joss-papers/blob/master/joss.00011/10.21105.joss.00011.pdf
https://github.com/openjournals/joss-papers/blob/master/joss.00189/10.21105.joss.00189.pdf
https://github.com/openjournals/joss-papers/blob/master/joss.00012/10.21105.joss.00012.pdf
https://github.com/openjournals/joss-papers/blob/master/joss.00016/10.21105.joss.00016.pdf
If your content meets our guidelines, you can generally expect to find it included within the Google Scholar results within 4-6 weeks.
Please keep in mind that bibliographic data is extracted from your pages by automatic software. If you aren’t satisfied with the accuracy of your listings, please refer to our technical guidelines at http://scholar.google.com/intl/en/scholar/inclusion.html for ways to provide more accurate bibliographic data.
Regards,
The Google Scholar team
I agree that having the paper PDFs linked to directly rather than the GitHub file preview may be smart—not sure if that will matter for Google Scholar.
Just in case that's an issue, I've updated the URLs on the site to link to the 'raw' GitHub URLs, which means they don't display in the GitHub UI. An example of this is:
I used to have the same problem a couple years ago when I put reprints of my papers into a github repository. I waited more than a year and it was still not in Google Scholar. Then I moved PDFs to a repository served through github pages -- and this helped.
I cannot be sure what the issue was, but perhaps it's because PDFs from github.com/.../raw/master/... are served as Content-Type "application/octet-stream" instead of "application/pdf".
Additional benefit of serving PDFs through github pages would be that the URL would look better, e.g. https://openjournals.github.io/joss-articles/10.21105.joss.00194.pdf
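The Content-Type hypothesis above is easy to test for any given PDF link. Here is a minimal sketch using only the Python standard library; the function names are made up for illustration, and it assumes the server answers HEAD requests. It only reports what the server declares and says nothing about how Google Scholar actually reacts to it.

```python
from urllib.request import Request, urlopen


def is_pdf_content_type(header_value):
    """Return True when a Content-Type header value declares a PDF."""
    # Drop optional parameters such as "; charset=..." before comparing.
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type == "application/pdf"


def fetch_content_type(url, timeout=10):
    """Send a HEAD request and return the declared Content-Type (may be empty)."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return response.headers.get("Content-Type", "")


if __name__ == "__main__":
    # Offline demonstration with the two values discussed in this thread.
    for value in ("application/octet-stream", "application/pdf"):
        print(value, "->", is_pdf_content_type(value))
```

To check a live link, pass its URL to fetch_content_type and feed the result to is_pdf_content_type.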
👍 thanks @wojdyr, that's very helpful. Good point about the application/octet-stream content type possibly upsetting the Google bot.
OK in https://github.com/openjournals/whedon/pull/11 I've modified the URLs we're serving to e.g. http://www.theoj.org/joss-papers/joss.00411/10.21105.joss.00411.pdf . Fingers-crossed that helps!
My paper is now indexed on Google Scholar! Others should check as well. That last modification might have done the trick.
Which paper is that @FaustinCarter?
Actually, my excitement may have been premature. It looks like it may only be indexed because it was listed as a citation here (thanks to the kind soul who cited it in a more traditional publication): http://adsabs.harvard.edu//abs/2016JOSS.2016...46B.
The JOSS link is: http://joss.theoj.org/papers/10.21105/joss.00046
If you search for "pygtc" on http://scholar.google.com it shows up as the first link, but with a [CITATION] tag preceding the title. I think this means that it is grabbing it from the adsabs rather than indexing it directly. This is further motivated by the fact that on both Google Scholar and the Harvard Adsabs service the abstract is listed as "Not available".
Bummer.
I believe this is now fixed.
See https://scholar.google.com/scholar?hl=en&as_sdt=0%2C9&q=10.21105&btnG= for example.
Still a little confused as to why my paper isn't being listed. @arfon is there any reason you can think of?
@Benjamin-Lee - not sure. I'm following up with some folks about this.
Another paper that hasn't made it is: https://joss.theoj.org/papers/cf6f8ac309d6a18b6d6cf08b64aa3f62
@FaustinCarter - yes, it looks like something stopped working in early August this year.
@arfon This https://joss.theoj.org/papers/0c6638f84a1a574913ed7c6dd1051847 paper was indexed, but the date was not (yet) extracted. The format of the JOSS papers does not meet the specifications that google scholar used to have. The specifications have changed a little but, currently, the date in joss papers is not as suggested at https://scholar.google.com/intl/en/scholar/inclusion.html#indexing section 2.a.C. This issue was closed some time ago, should we reopen a new issue?
AFAIK, Google Scholar doesn't index us directly, rather, our papers are picked up via ADS. I'll follow up with the folks at ADS to see if there's something different we should be doing.
I was going to create a new issue for this, but seeing as there seems to be recent discussion on this thread, I would like to also mention that our article does not seem to be picked up by Google Scholar.
I don't know if this information is helpful, but the last article that seems to be picked up by theoj.org is this one (50 days ago): https://www.theoj.org/joss-papers/joss.01102/10.21105.joss.01102.pdf.
The last one picked up by ads is this one (147 days ago): http://adsabs.harvard.edu/abs/2018JOSS....3..854M
It seems indexing via Google Scholar stopped roughly 50 days ago, but there doesn't seem to be any PR around that time that should affect this. I suppose a solution for now would be to upload my article to an institutional repository of my university?
@arfon
I wonder if this might be the problem:
Published JOSS paper pages now seem to leave the Google Scholar tag "citation_author" empty:
<meta name="citation_author" content="">
Example: https://joss.theoj.org/papers/10.21105/joss.01342
Google Scholar (https://scholar.google.com/intl/en/scholar/inclusion.html#indexing) states that:
At least one author tag is required for inclusion in Google Scholar.
The time frame of the problem, as stated by you, supports this hypothesis:
@FaustinCarter - yes, it looks like something stopped working in early August this year.
Paper from 1 September 2018 with no author tag: https://joss.theoj.org/papers/efb9242db91adee8c8265f000f26ef5a
Paper from 29 June 2018 with author tag: https://joss.theoj.org/papers/049f6d3dab9391e8353484028148dd0d
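For anyone who wants to check a paper page for this condition, here is a minimal sketch with Python's built-in HTML parser. The class and function names are hypothetical; it only looks at meta tags whose name starts with "citation_" and treats an empty or whitespace-only content attribute as missing. Whether an empty author tag is really what blocks indexing is still just the hypothesis above.

```python
from html.parser import HTMLParser


class CitationMetaParser(HTMLParser):
    """Collect the content of every <meta name="citation_*"> tag on a page."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.startswith("citation_"):
            content = (attributes.get("content") or "").strip()
            self.tags.setdefault(name, []).append(content)


def citation_authors(html):
    """Return the non-empty citation_author values found in the HTML."""
    parser = CitationMetaParser()
    parser.feed(html)
    return [author for author in parser.tags.get("citation_author", []) if author]


if __name__ == "__main__":
    page = '<head><meta name="citation_author" content=""></head>'
    authors = citation_authors(page)
    print("author tags OK" if authors else "no usable citation_author tag")
```

Saving a paper page's HTML and passing it to citation_authors shows at a glance whether at least one author tag is populated.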
The <meta name="citation_author" content=""> tag (denoted by "empty") is not enough to explain the missing indexing. I checked some of the papers below, and it doesn't explain why some are listed from JOSS.
However, there's definitely still an issue with Google Scholar indexing. Maybe this issue should be reopened? @arfon
From 1 month ago:
Also 2 months:
Also 3 months ago:
Also 6 months ago:
Hi all, thanks for digging into this further. A couple of things:
The <meta name="citation_author" content=""> tag is now populated for all papers. I'm not sure what more to do at this point but am open to suggestions/improvements.
Is there any update on this, e.g., is there a way to manually update the indexing? (Specifically asking for https://joss.theoj.org/papers/10.21105/joss.01636) As a side note I noticed that the "Copy bibtex button" doesn't seem to get updated automatically.
I looked up that paper on Google Scholar, using the title, and in this case it is being indexed through adsabs.harvard.edu.
The citation info in Harvard style is given as:
Brummel-Smith, C., Bryan, G., Butsky, I., Corlies, L., Emerick, A., Forbes, J., Fujimoto, Y., Goldbaum, N., Grete, P., Hummels, C. and Kim, J.H., 2019. ENZO: An Adaptive Mesh Refinement Code for Astrophysics (Version 2.6). The Journal of Open Source Software, 4.
and the BibTeX is given as:
@article{brummel2019enzo,
title={ENZO: An Adaptive Mesh Refinement Code for Astrophysics (Version 2.6)},
author={Brummel-Smith, Corey and Bryan, Greg and Butsky, Iryna and Corlies, Lauren and Emerick, Andrew and Forbes, John and Fujimoto, Yusuke and Goldbaum, Nathan and Grete, Philipp and Hummels, Cameron and others},
journal={The Journal of Open Source Software},
volume={4},
year={2019}
}
... which all seems OK. What is your concern about the indexing of this article?
I see. Looks like I got confused/missed it because neither authors nor references are parsed from ADS. Regarding bibtex, when I click the button the content that ends up in my clipboard is "BibTex entry not available. Please check back later."
Oh, I was grabbing the BibTeX info from the Google Scholar "Cite" dialog, not the JOSS website.
Regarding bibtex, when I click the button the content that ends up in my clipboard is "BibTex entry not available. Please check back later."
Yeah, this is super-buggy and I've just removed it from the UI for now until we can find a long-term fix.
Any recommendations/updates in 2022?
On what in particular sorry?
@arfon, sorry - for ensuring that a particular JOSS article gets indexed on Google Scholar for all co-authors.
Are you having issues with one of your papers being indexed? If so, could you describe the issue in more detail please?
@arfon looks like it went through OK. Just required some more patience on my part. Thanks!
Great stuff!
Now that this was brought up again, I still see issues with GS correctly identifying citations.
For instance, this article cites my work here, but the citation is not listed in GS (but is listed in e.g. Dimensions).
Have others experienced similar things, or could this be caused by the line break in the journal name? (I can open up a separate issue if that's helpful)
Not sure sorry. Google Scholar is a bit of a black box to us all.
Hi @arfon ! My article was published in JOSS on Oct 23 but google scholar is still not seeing it. Is there anything I can do? It appears there is not much we can do based on the above comments but just wanted to double check given the time passed since the last comment :)
I'm afraid not @ManavalanG . We don't have any visibility into the Google Scholar operations.
Articles are being added: https://scholar.google.com/scholar?hl=de&as_sdt=0%2C5&q=source%3Ajoss+OR+source%3A%22Journal+of+Open+Source+Software%22&btnG=
But it takes a long while.
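In case it helps others re-run this check, here is a small sketch that rebuilds the search link above. The "source:" operator is taken from that URL only; I am assuming it behaves as a venue filter, and the function name is made up for illustration.

```python
from urllib.parse import urlencode


def scholar_source_query_url(sources, language="en"):
    """Build a Google Scholar search URL that ORs several source: terms."""
    terms = []
    for source in sources:
        # Quote multi-word names so they are matched as a phrase.
        term = f'"{source}"' if " " in source else source
        terms.append(f"source:{term}")
    query = " OR ".join(terms)
    return "https://scholar.google.com/scholar?" + urlencode({"hl": language, "q": query})


if __name__ == "__main__":
    print(scholar_source_query_url(["joss", "Journal of Open Source Software"], "de"))
```

The printed URL carries the same q parameter as the link above, without the as_sdt and btnG parameters.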
@hauschke My article was published on Oct 23 and is yet to be seen by google scholar. However when I checked multiple articles that were published in JOSS since then, all of them showed up in google scholar including an article published on Nov 11. This made me curious :)
Hi! I just wanted to note that our article published in Oct 2023 is yet to appear in google scholar 😞 I checked if there is a mechanism on my end to make it available via google scholar, but I didn't find any. I gather from the earlier conversation that JOSS admins can't do much about it, but I wanted to register my frustration here 😢
Hi all, just wanted to chime in that our paper: https://joss.theoj.org/papers/10.21105/joss.05566 is appearing in the Google Scholar feed of @matsvanes , but is not appearing in my own feed or that of the other co-authors (@robertoostenveld @schoffelen).
Is there anything that can be done? Perhaps metadata are OK for the first author (in whose feed it is appearing) but not for the co-authors?
My paper has finally made it to google scholar a month or so ago, but it still doesn't appear in google scholar feed of mine or other authors. Looks like this is similar to the issue mentioned by @Spaak.
@Spaak I actually added it to my Google Scholar manually. I saw they're trying to get JOSS articles indexed in PubMed automatically, but I don't know if any work is being done for Google Scholar?
Confirming again that all criteria listed in google scholar's content, crawling, and indexing requirements are met: https://scholar.google.com/intl/en/scholar/inclusion.html
The only thing I could think of is to make a special page with all the papers from the last two weeks (the current "most recent papers" feed only lists from ~6 days ago, depending on volume), and they say:
For websites with more than a hundred thousand papers, we recommend that you create an additional browse interface that lists only the articles added in the last two weeks. This smaller set of webpages can be recrawled more frequently than your entire browse interface, which will facilitate timely coverage of your recent papers by the search robots.
not sure if that would help but it's the only thing they recommend that we don't do.
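The selection logic for such a two-week page could be sketched roughly as below. This is not how the JOSS site is implemented; the paper records are plain dicts with a hypothetical "published" date field, and the 14-day window comes straight from the Google Scholar recommendation quoted above.

```python
from datetime import date, timedelta


def recent_papers(papers, today, window_days=14):
    """Return papers published within the last `window_days`, newest first."""
    cutoff = today - timedelta(days=window_days)
    recent = [paper for paper in papers if cutoff <= paper["published"] <= today]
    return sorted(recent, key=lambda paper: paper["published"], reverse=True)


if __name__ == "__main__":
    # Hypothetical records; a real page would read these from the paper database.
    papers = [
        {"id": "paper-a", "published": date(2024, 1, 20)},
        {"id": "paper-b", "published": date(2024, 1, 1)},
        {"id": "paper-c", "published": date(2024, 1, 10)},
    ]
    for paper in recent_papers(papers, today=date(2024, 1, 22)):
        print(paper["id"], paper["published"].isoformat())
```

Passing `today` explicitly keeps the function easy to test; a page handler would pass date.today().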
It does not at the moment look like JOSS is being indexed by google scholar. I think this is one of the sites we want to ensure visibility for the journal on.
https://scholar.google.com.au/intl/en/scholar/inclusion.html#troubleshooting