nteract / scrapbook

A library for recording and reading data in notebooks.
https://nteract-scrapbook.readthedocs.io
BSD 3-Clause "New" or "Revised" License
281 stars 26 forks source link

NES Feedback Overhaul #29

Closed MSeal closed 5 years ago

MSeal commented 5 years ago

This is a massive overhaul PR which takes in feedback from a lot of sources which I bundled together to save many potentially conflicting PRs being made. But the changes break down into these larger topics:

TODO after merge:

todo[bot] commented 5 years ago

default to 'display' encoder when encoder is None and object is a display object type?

https://github.com/nteract/scrapbook/blob/fb3dcf52e59fc57540b49380e5df948aaa407329/scrapbook/api.py#L79-L84


This comment was generated by todo based on a TODO comment in fb3dcf52e59fc57540b49380e5df948aaa407329 in #29. cc @MSeal.
todo[bot] commented 5 years ago

set encoder information to save as encoding

https://github.com/nteract/scrapbook/blob/fb3dcf52e59fc57540b49380e5df948aaa407329/scrapbook/encoders.py#L147-L152


This comment was generated by todo based on a TODO comment in fb3dcf52e59fc57540b49380e5df948aaa407329 in #29. cc @MSeal.
todo[bot] commented 5 years ago

read from encoder information to load as encoding

https://github.com/nteract/scrapbook/blob/fb3dcf52e59fc57540b49380e5df948aaa407329/scrapbook/encoders.py#L154-L159


This comment was generated by todo based on a TODO comment in fb3dcf52e59fc57540b49380e5df948aaa407329 in #29. cc @MSeal.
codecov[bot] commented 5 years ago

Codecov Report

Merging #29 into master will increase coverage by 0.16%. The diff coverage is 95.02%.

@@            Coverage Diff             @@
##           master      #29      +/-   ##
==========================================
+ Coverage   95.33%   95.49%   +0.16%     
==========================================
  Files           7        7              
  Lines         257      355      +98     
==========================================
+ Hits          245      339      +94     
- Misses         12       16       +4
MSeal commented 5 years ago

Code coverage dropped slight despite more tests. I think this was due to more error message code lines. It's still well above 90% so I'm not worried.

MSeal commented 5 years ago

Solves: https://github.com/nteract/scrapbook/issues/27 https://github.com/nteract/scrapbook/issues/28 https://github.com/nteract/scrapbook/issues/19 (duplicate)

willingc commented 5 years ago

Whoa! Nice @MSeal. Will give a closer review this evening.

MSeal commented 5 years ago

Yes we'll need python 2 support in scrapbook for papermill deprecation. I did some stats gathering and 1/5th of papermill downloads are to python 2 environments still. I think dropping python 2 support for both scrapbook and papermill at end-of-year when python 2 official support drops makes the most sense.

willingc commented 5 years ago

I think dropping python 2 support for both scrapbook and papermill at end-of-year when python 2 official support drops makes the most sense.

Let's add that tentative plan to the README and/or setup.py to signal that intent.

rgbkrk commented 5 years ago

The only piece I wanted to dig into check on was that the media types matched well with the notebook format specification and limitations.

Right now this looks good for what's there as the types look like this:

application/scrapbook.scrap+json: JSONObject
application/scrapbook.scrap+unicode: string

For other encoders, they can only fit this pattern:

application/scrapbook.scrap+[name]: string

This is due to the fact that anything that doesn't end in +json is forced to be a string in the notebook format.

This means that if you ever want an encoder to use json under the hood like this:

{
  "ref": "s3://..."
}

You'll be stuck writing it as an encoded string within the json notebook, because the notebook format has to (for legacy reasons) treat anything that doesn't end in +json as a string.

Perhaps then we'd have a new ref structure (?). An alternative to this is to always nest it under application/scrapbook.scrap+json but where a scrap has a particular format like:

"application/scrapbook.scrap+json": {
  "storage": "json",
  "data": { "a": 2 }
}
"application/scrapbook.scrap+json": {
  "storage": "arrow-by-ref",
  "data": { "ref": "s3://.../" }
}
rgbkrk commented 5 years ago

Note: my comment above shouldn't be taken to block this PR or effort, only to raise up some possibility of inflexibility that you might want to be aware of before making a call on this.

MSeal commented 5 years ago

Wait...

You'll be stuck writing it as an encoded string within the json notebook, because the notebook format has to (for legacy reasons) treat anything that doesn't end in +json as a string.

Seriously?!?! uuuhhhggg. Ok this kind of makes sense and I didn't think about it much before, so ty for pointing that out.

Ok what about if we did "application/scrapbook.scrap.<encoder_type>+json"? @rgbkrk

rgbkrk commented 5 years ago

Ok what about if we did "application/scrapbook.scrap.<encoder_type>+json"?

Neat, I like that! Then each of those can choose whether they're represented with json or text (unless it's always going to be a structured object).

MSeal commented 5 years ago

For now we'll just support the structured doc whose "data" key has JSON or text -- but the leaves us open to other top level formats.