malariagen / ag1000g-phase3-data-paper

Other
1 stars 2 forks source link

Q90/91 Population sampling tables #20

Closed cclarkson closed 3 years ago

cclarkson commented 3 years ago

Q90/Q91

Notebook builds two csv tables:

N.B.

review-notebook-app[bot] commented 3 years ago

Check out this pull request on  ReviewNB

Review Jupyter notebook visual diffs & provide feedback on notebooks.


Powered by ReviewNB

leehart commented 3 years ago

where/how you run the md to deploy/visualise it?

I still haven't completely nailed local dev, but I reckon it's something like:

(There are some notes from AM in the Discourse thread, including the options of disabling PDF build, and triggering local builds automatically using inotifywait, and serving the web-page locally using Python's http.server.)

Deployment to the hosted https://malariagen.github.io/ag1000g-phase3-data-paper space will happen automatically whenever there is a push to master.

When a PR is made, or whenever there is a push to an existing PR, deployment to the hosted space should happen automatically too, except only to the corresponding subdir, e.g. https://malariagen.github.io/ag1000g-phase3-data-paper/PR/12/

the links to the deployed example tables 404.

Deployed PRs are removed every time that the master branch is updated. I'll update that PR by merging in the latest from master, and check that it gets redeployed, and let you know.

leehart commented 3 years ago

@cclarkson Those links to the deployed PR #12 work now, and will probably 404 again at some point, as explained above.

cclarkson commented 3 years ago

@hardingnj, how would you describe what a "sample set" is? Is it simply a submission set or has any splitting/joining been done?

cclarkson commented 3 years ago

@hardingnj , @leehart - first draft is ready for comments. I'm finding pantable a bit clunky after latex, do we know how to get italics in the species column headers (md syntax doesn't work) and adjust row height?

@leehart, how did you link pdf/html from an unmerged branch (so you guys don't have to run my code)?

leehart commented 3 years ago

@leehart, how did you link pdf/html from an unmerged branch (so you guys don't have to run my code)?

@cclarkson That should be available automatically after the CI passes at URLs named after the PR number, i.e.:

Looks like column width problems on that first table. Have you tried table-width rather than column-centric width? Some demos at https://malariagen.github.io/manubot-test/ (That might be a good place for the how-tos you mentioned.)

02.delete-me.md has the column heading Bowling Scores in italics, using Markdown syntax, so it's disappointing that doesn't work - I guess it must be the wrapper/context.

| *Bowling Scores* | Jane          | John          | Alice         | Bob           |
|:-----------------|:-------------:|:-------------:|:-------------:|:-------------:|
| Game 1 | 150 | 187 | 210 | 105 |
| Game 2 |  98 | 202 | 197 | 102 |
| Game 3 | 123 | 180 | 238 | 134 |

I haven't adjusted row heights yet, but I'll look into it. I expect we could use a CSS style, in theory. [Why do we need to adjust row heights?]

leehart commented 3 years ago

I expect pantable's markdown key needs to be set to true. Looks like it defaults to None: https://github.com/ickc/pantable

The example in their README doesn't actually show Markdown being used in the column headers, but does show it being used everywhere else!

It seems a bit wrong to be changing the CSV just to apply visual styling, so I'm wondering if there's a different way, maybe using CSS.

cclarkson commented 3 years ago

Column width is weird, it looked fine for me locally. Any ideas why that might happen?

leehart commented 3 years ago

Column width is weird, it looked fine for me locally. Any ideas why that might happen?

It looks like the table isn't as wide as it should be, maybe because all of the specified column widths don't add up to 1? I think they are meant to be fractions of the full width. Did you try simply removing them? Shall I try my hand? I also think 7pt might be too small, looking at it.

cclarkson commented 3 years ago

Hi @hardingnj, @leehart - I think this is looking okay now. I'd like to be able to split column header alignment (i.e. central) from column alignment (i.e text : left, numerical : right) but I don't think that's possible. I have left the table as central aligned as having differently aligned headers, following "text : left, numerical : right", looked weird.

@hardingnj - did you want to explain "sex unknown" in the legend or in the text somewhere?

leehart commented 3 years ago

Hi @hardingnj, @leehart - I think this is looking okay now. I'd like to be able to split column header alignment (i.e. central) from column alignment (i.e text : left, numerical : right) but I don't think that's possible. I have left the table as central aligned as having differently aligned headers, following "text : left, numerical : right", looked weird.

I hear you. Unfortunately the first thing I think of when I see the table is that the numbers should be right aligned!

Ideally there would be a way to separate the the column headers and their styling from the data, to give us more flexibility. I'd like to investigate that, so that we can have more control over styling the column headers, then the rest of the table is purely data, and we already know we can align the data values. Although we don't yet have other things like value formatting.

It's worth considering writing our own Pantable-like Pandoc filter, if we can't get the features we need, or requesting them. https://github.com/ickc/pantable

So, you're saying right-aligned values with centre-aligned headers looks weirder than centre-aligned everything? Oh - I've just realised that the number of decimal places in those values is inconsistent [they're not zero-padded], which would mess up alignment. [But this only affects lat & lon, so I'm still curious to see why right-aligning the other values looks bad.] I was thinking we could delay fixing the column-header alignments later, since we also wanted to solve column-header styling too.

cclarkson commented 3 years ago

@hardingnj - unfortunately the reason the column names are so short is to fit the table onto a page without making the text too small for publication. This is why I describe the columns in the legend.

hardingnj commented 3 years ago

I think given that the most likely location for this is in the supplement- I think my preference is to go with the more descriptive column names? If it does make it to the main pdf we can always drop the hybrid count columns. Let's discuss next Monday.

cclarkson commented 3 years ago

@hardingnj, it would be good to close this PR. I propose building the table as a csv using the long headings you suggested, that way the columns will not get broken by manubot. If we, at a later date, want this in the paper, we can re-visit and make a version that fits on a page.

What do you think?

hardingnj commented 3 years ago

Thanks Chris for following up on this, .csv with long headings sounds good to me.

cclarkson commented 3 years ago

superseded by https://github.com/malariagen/ag1000g-phase3-data-paper/pull/46