manubot / rootstock

Clone me to create your Manubot manuscript
https://manubot.github.io/rootstock/

Tracking Manubot usage #67

Closed agitter closed 5 years ago

agitter commented 7 years ago

It may be nice in the future to produce statistics about how many documents have been authored with Manubot and this rootstock, or to refer to more examples. @dhimmel has https://github.com/dhimmel/rephetio-manuscript/, and there were examples listed in #62.

I haven't been able to think of a non-invasive way to track this. Does anyone else have ideas? Is this worthwhile?

dhimmel commented 7 years ago

Agreed. We need a way to search for GitHub repositories that meet a set of criteria. I'm struggling to get the online GitHub search to return repos, rather than files, that meet the criteria.

agitter commented 7 years ago

I found a help page that shows how to search for terms in a repository's readme. Searching manubot in:readme works well!

dhimmel commented 7 years ago

Nice! 1 false positive (greenelab/manubot); 11 true positives.

dhimmel commented 6 years ago

@agitter's search on 2018-03-26 returns 29 repositories. Some interesting manuscripts I hadn't seen before include

dhimmel commented 6 years ago

Additional instances that appear to currently be in progress are:

dhimmel commented 6 years ago

@agitter's search on 2018-10-08 returns 40 repositories. Some interesting manuscripts I hadn't seen before include

dhimmel commented 5 years ago

Manubot Catalog released

We now have a catalog of manuscripts written using Manubot at https://manubot.org/catalog/. The catalog is defined in the https://github.com/manubot/catalog repository, with CI set up to fetch bibliographic details and trigger deployment.

Going forward, we will add new manuscripts to the catalog rather than commenting on them here.

The GitHub search will still be useful to discover new Manubot manuscripts. Here it is, modified to be ordered by "Recently updated": https://github.com/search?o=desc&q=manubot+in%3Areadme&s=updated&type=Repositories
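For anyone who wants to track the count over time rather than clicking through the web search, here is a minimal sketch that runs the same `manubot in:readme` query against GitHub's repository search API. The function names and defaults are my own assumptions, not part of Manubot, and unauthenticated requests are rate limited.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

GITHUB_SEARCH_API = "https://api.github.com/search/repositories"


def build_search_url(term="manubot", qualifier="in:readme",
                     sort="updated", order="desc", per_page=100):
    """Build the API equivalent of the web search linked above."""
    params = {
        "q": f"{term} {qualifier}",  # same query string as the web search
        "sort": sort,                # "updated" mirrors "Recently updated"
        "order": order,
        "per_page": per_page,
    }
    return f"{GITHUB_SEARCH_API}?{urlencode(params)}"


def count_repositories(url):
    """Return the total number of matching repositories (needs network access)."""
    request = Request(url, headers={"Accept": "application/vnd.github+json"})
    with urlopen(request, timeout=30) as response:
        return json.load(response)["total_count"]


# Example (performs a live request, so it is left commented out):
# print(count_repositories(build_search_url()))
```

The count includes the same false positives as the web search, so it is an upper bound on README matches, not a count of real manuscripts.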

vincerubinetti commented 5 years ago

> @agitter's search on 2018-10-08 returns 40 repositories

Did we adequately capture all of these in the new catalog @dhimmel ?

dhimmel commented 5 years ago

> Did we adequately capture all of these in the new catalog

No, there are some repositories that are really just stubs without much original content. There are also several repositories in early stages where I could see the authors wanting to wait before publicizing them (OTOH they are public).

Another option would be to add a catalog field like hidden. Then we could add every repository to the catalog, while keeping some hidden (at least by default) on manubot.org/catalog. What do you think?

BTW as of 2019-07-10, I get 74 repository results.
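To make the `hidden` idea concrete, here is a rough sketch of how a listing could filter entries. The entry structure, the `repo_url` key, and the `visible_entries` helper are hypothetical and do not reflect the actual catalog schema.

```python
def visible_entries(catalog, show_hidden=False):
    """Return the catalog entries to display.

    Entries flagged with hidden=True are skipped unless show_hidden is set.
    Entries without the field are treated as visible.
    """
    return [
        entry for entry in catalog
        if show_hidden or not entry.get("hidden", False)
    ]


# Hypothetical catalog entries, for illustration only.
catalog = [
    {"repo_url": "https://github.com/example/finished-manuscript"},
    {"repo_url": "https://github.com/example/stub-manuscript", "hidden": True},
]

default_view = visible_entries(catalog)                 # hidden entries omitted
full_view = visible_entries(catalog, show_hidden=True)  # everything listed
```

Defaulting a missing field to visible means existing entries would not need to be edited if such a field were introduced.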

vincerubinetti commented 5 years ago

Yeah by "adequately" I meant all the ones that are relatively finished and/or worth putting in the catalog.

I'd be down to put in all of them with a special tag and then another checkbox like "show in progress" or "show stubs".

agitter commented 5 years ago

Previously we asked authors before we added their manuscript to the example manuscript list in the Rootstock repository. Do we want to continue contacting authors before adding manuscripts to the catalog?

The manuscripts are public and easy to find via GitHub search, so we aren't leaking any information by adding them to the catalog. Nonetheless, I expect some authors would prefer to not have their early stage manuscripts advertised.

dhimmel commented 4 years ago

> Do we want to continue contacting authors before adding manuscripts to the catalog?

If an author has advertised a manuscript publicly (or has posted a preprint or published the work), then I think we should add it to the catalog.

Still not sure about in-progress works.

agitter commented 4 years ago

Your proposal regarding publicly advertised, preprinted, or published work makes sense to me.

I suggest that we don't include work in progress in the catalog without the authors' permission.

agitter commented 4 years ago

I was curious about current usage statistics and ran the Manubot search today, sorting by recently updated. There are now over 20 repos updated in the last week and about 60 updated in the last month. Many of those are test manuscripts, but it's still a lot of legitimate open writing.

agitter commented 3 years ago

The simple GitHub search has false positives (readmes that discuss Manubot but are not manuscripts) and false negatives (manuscripts that use Manubot but have customized readmes or configurations). Is there a way to search the HTML metadata of the generated manuscript for a unique property like manubot_html_url_versioned to improve the quality of our automated searches? I'm not sure whether search engines index this metadata.

@vincerubinetti do you know?
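Whatever the answer on search engine indexing, the property could at least be used to verify candidates found by other means. Below is a minimal sketch that checks a page's HTML for a `manubot_html_url_versioned` meta tag. I am assuming the value sits in a `<meta>` element's `content` attribute and is keyed by either `name` or `property`. The class and function names are made up for illustration.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect <meta> tags into a dict keyed by their name or property."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Assumption: the key may be stored under either attribute.
        key = attributes.get("name") or attributes.get("property")
        if key:
            self.meta[key] = attributes.get("content")


def manubot_versioned_url(html_text, key="manubot_html_url_versioned"):
    """Return the value of the Manubot meta tag, or None if it is absent."""
    collector = MetaCollector()
    collector.feed(html_text)
    return collector.meta.get(key)


# Hypothetical page content, for illustration only.
SAMPLE = (
    '<html><head><meta name="manubot_html_url_versioned" '
    'content="https://example.org/manuscript/v/abc123/" /></head></html>'
)
is_manubot_page = manubot_versioned_url(SAMPLE) is not None
```

This only confirms a page that is already in hand. It does not solve discovery, which still depends on whether a search engine exposes this metadata.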