Open seankross opened 7 years ago
@seankross what a great discussion to be inspired by :wink:
I guess you might also need some sort of hunspell thing with a dictionary made of R functions to help correct the typos that'd probably be created by the OCR?
Oh wait, I guess the typos thing is what you mean by "trained" tesseract models, sorry.
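A rough sketch of the dictionary idea, in case it helps. This swaps hunspell for plain edit distance from base R (`utils::adist()`), and `correct_token()` is a made-up helper name, so treat it as an illustration rather than how hunspell or tesseract would actually do it:

```r
# Sketch: snap OCR'd tokens to the nearest known R function name.
# `correct_token()` is a hypothetical helper, not part of hunspell or tesseract.
known_functions <- ls("package:base")

correct_token <- function(token, dictionary = known_functions, max_dist = 1) {
  # Leave tokens alone if they are already valid function names
  if (token %in% dictionary) return(token)
  # Levenshtein distance from the token to every dictionary entry
  d <- utils::adist(token, dictionary)[1, ]
  best <- which.min(d)
  # Only accept a correction that is within `max_dist` edits
  if (d[best] <= max_dist) dictionary[best] else token
}

ocr_tokens <- c("1ibrary", "nchar", "my_variable")
vapply(ocr_tokens, correct_token, character(1), USE.NAMES = FALSE)
```

Very short tokens and user-defined names can easily sit one edit away from some base function, so a real version would probably need a stricter rule than a single distance threshold, and ties between equally close names are resolved arbitrarily here (first match wins).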
This sounds like a valuable complement to .
This is an interesting idea!
Definitely interested in this idea, but particularly the output side:
Right now my idea for taking screenshots is webshot::appshot()
This is great!
So what if we injected some metadata into the image that contained a link to a gist with the output? A corresponding fetch method could take a tweet URL and automatically return the code, using the image metadata to find the corresponding gist.
With this facility the need for OCR would hopefully phase out over time. Although I think the OCR idea is worthwhile in the first instance.
A LoFi version of this could just return a gist link and an image, leaving the user to pair them in a tweet. But, then you have the link eating into your original witty remark. I dunno how acceptable that would be. In this community probably not very.
Could we bot this? I'm thinking something opt-in, where a user would have to set up something akin to IFTTT approval, and it could have a trigger tag. Even as I write this now it's beginning to sound too convoluted, but the end idea would be that there would be a reply triggered with the gist link, so as not to cut into witty-remark real estate.
Twitter, it seems, strips embedded metadata out of image files after it extracts location data: https://support.twitter.com/articles/20156423
Twitter supports attaching metadata to an image in the tweet itself, though I'm not sure what/how much metadata is supported. It can be consumed by the reader. I imagine they don't want this to serve as a place to hide large payloads of information, but putting a gist link in alt_text is probably OK:
Using this would probably require having the R package not just generate the image, but post the tweet as well.
stegasaur, by the incomparable @richfitz, will encode text or arbitrary R objects into images via steganography. This may be the way to go if the data survives any image optimization twitter may perform.
Another idea: As much as I don't like QR codes in principle, maybe there should be a QR or other kind of barcode that's embedded in the image and we could store the gist url there, although this wouldn't be necessary if we can read and write the tweet metadata.
Also @MilesMcBain we could add the gist link to the image itself so a human could read it, or they could use tweet_to_gist([tweet url]) to get the gist url. The image itself would look something like:
# https://gist.github.com/hadley/37c8078eb9d46b5dac7e
awesome_stack_overflow_data %>%
dplyr_function() %>%
tidyr_function() %>%
ggplot3(aes = c(language, awesomeness)) +
geom_oculus(fov=Inf)
How about riffing off of/extending: https://github.com/hrbrmstr/hrbraddins/blob/master/R/tweet-share.r
+1 for @noamross's suggestion for using stegasaur to transmit code via twitter images!
I realize that PNGs uploaded to twitter seem to be converted to JPGs. Not sure whether the steganographic encoding will survive that.
some algos can gen steg data that will at least partially survive but it's unlikely source code will.
a good chunk of providers use https://github.com/cloudflare/jpegtran (or a derivative) and one of the design goals is to beat malware which has a side-effect of beating steg in most cases.
I lean toward the options resulting in an image + gist link. The steganography and OCR approaches sound fun but I suspect they will be less accessible to many people who are interested in the code. That's worth giving up some tweet characters to the gist URL, IMO. @jennybc I agree this feels natural to include in the reprex package as a "tw" option in the existing reprex::reprex function.
💯 to @sfirke. Steganography feels cool and fun, but cramming the information into a less accessible spot makes the code far harder to get at. A gist + short URL + screenshot seems best.
@hrbrmstr's "trick" for getting the screenshot is great but will require LaTeX, because PDF, right? I wonder if there's a way around that?
I always worry abt phantomjs working consistently and also being abused by malware on windows.
On Tue, Mar 28, 2017 at 17:50 Jennifer (Jenny) Bryan wrote:
Maybe render to html and use webshot https://github.com/wch/webshot?
I note that Twitter's alt_text field is designed for, and used by, people with visual impairments, so we wouldn't want to hijack it for other purposes. But including text such as, "Image of R code, full code at https://gist.github...." would make the screenshot more accessible and would be a great use of the field whether or not the gist link is included in the tweet separately.
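To make the alt-text idea concrete, here is a minimal sketch in base R. Both helper names are hypothetical (nothing here talks to the Twitter API), and the URL pattern is my assumption about what a gist link looks like, based on the example gist URL earlier in the thread:

```r
# Hypothetical helpers; neither is part of an existing package.

# Build a human-readable alt text that also carries the gist link
make_alt_text <- function(gist_url) {
  paste0("Image of R code, full code at ", gist_url)
}

# Pull the first gist URL back out of alt text (or any other string)
extract_gist_url <- function(text) {
  pattern <- "https://gist\\.github\\.com/[A-Za-z0-9_-]+/[0-9a-f]+"
  hits <- regmatches(text, regexpr(pattern, text))
  # regmatches() returns a zero-length vector when nothing matched
  if (length(hits) == 0) NA_character_ else hits
}

alt <- make_alt_text("https://gist.github.com/hadley/37c8078eb9d46b5dac7e")
extract_gist_url(alt)
```

Keeping the sentence readable first and machine-parseable second seems like the right priority, since the field exists for people using screen readers.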
I think you can avoid LaTeX or PhantomJS/webshot solutions entirely by just placing the text onto a blank image with the R graphics device. Some careful tweaking would be needed to make it look good and be right-sized for arbitrary code, and you'd want to pick a good sans font that is accessible to R on most systems, but it avoids any round-tripping.
imagemagick supports text annotations directly on an image and I'm 99.999% sure (didn't test it) that magick::image_annotate() implements that part of the imagemagick API. i had this as a mental note to try vs my slacker-use of knit-pdf-to-image.
One can do it with just png(), plot.new(), and text(), no? No magic required.
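Something along these lines, as a minimal sketch. `code_to_png()` is a hypothetical name, the fixed canvas size and the "mono" font family are my assumptions, and nothing here tries to right-size the image for arbitrary code:

```r
# Hypothetical helper: draw lines of code onto a blank PNG with base graphics.
code_to_png <- function(code, file, width = 1024, height = 512, cex = 1.5) {
  png(file, width = width, height = height)
  # Close the device (and write the file) even if drawing fails
  on.exit(dev.off())
  # No margins; "mono" is a generic family that R maps to a monospaced font
  par(mar = c(0, 0, 0, 0), family = "mono")
  # plot.new() sets up an empty plot with a 0..1 coordinate system
  plot.new()
  # Anchor the text block by its top-left corner, near the top-left of the canvas
  text(x = 0.02, y = 0.98, labels = paste(code, collapse = "\n"),
       adj = c(0, 1), cex = cex)
  invisible(file)
}

code <- c("library(dplyr)",
          "mtcars %>%",
          "  group_by(cyl) %>%",
          "  summarise(mpg = mean(mpg))")
out <- code_to_png(code, tempfile(fileext = ".png"))
file.exists(out)
```

Long lines will simply run off the right edge here, so a real version would want to measure the text (for example with `strwidth()`) and pick the canvas size or `cex` to fit.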
After seeing another tweet of pic of code => optimize discussion yesterday (https://twitter.com/tonyfischetti/status/866457187140370433), wanted to reiterate how valuable I think this could be (which might just involve spreading the word re @hrbrmstr's tweet-share script in hrbraddins, if we think it's already been covered).
For readers curious about the result of this discussion, the pkg is here: https://github.com/ropenscilabs/codefinch (googling took me to this thread, and probably will again the next time I forget the pkg name, so this comment is a kind of redirecting bookmark)
For potential future reference, I'm currently quite infatuated with carbon: https://carbon.now.sh/
Oooooooh carbon. Looks awesome!
Aye!
And of course I'd love asciinema to come to windows so we could do terminal gifs and then tweet them on windows https://asciinema.org/
I realised it was going to be a few minutes' work to add carbon support to gistfo, since they support gists. So you can now send the active RStudio tab to carbon.now.sh: https://github.com/MilesMcBain/gistfo
I've been inspired by this discussion on Twitter:
These are the kinds of tweets I'm referring to in this discussion:
I actually like seeing screenshots of R code in tweets, but then of course I wish I had the code! You could try to extract the code with tesseract but doing that every time for every tweet can be messy.
What do you think about building a package that takes an R file and creates a screenshot of the code (with options that optimize the screenshot for Twitter), and then in that package we include trained tesseract models for extracting code from those screenshots? There could even be a function that takes a tweet and gives you the code, like tweet_to_code("https://twitter.com/drob/status/840232496860135424", file = "drob.R"). Right now my idea for taking screenshots is webshot::appshot().