laminlabs / lamin-usecases

Use cases.
https://docs.lamin.ai
Apache License 2.0

birds eye review #37

Closed Zethson closed 1 year ago

Zethson commented 1 year ago

Comments on https://lamin.ai/docs/birds-eye

Generally, I already really like it.

upload and analyze the GWS data

    filepath = ln.dev.datasets.schmidt22_crispra_gws_IFNG(ln.settings.storage)
    file = ln.File(filepath, description="Raw data of schmidt22 crispra GWS")
    file.save()
    ln.setup.login("testuser2")
    transform = ln.Transform(name="GWS CRIPSRa analysis", type="notebook")
    ln.track(transform)

    file_wgs = ln.File.filter(key="schmidt22-crispra-gws-IFNG.csv").one()
    df = file_wgs.load().set_index("id")
    hits_df = df[df["pos|fdr"] < 0.01].copy()
    file_hits = ln.File(hits_df, description="hits from schmidt22 crispra GWS")
    file_hits.save()


I'd go for

app upload

    ln.setup.login("testuser1")
    transform = ln.Transform(name="Upload GWS CRISPRa result", type="app")
    ln.track(transform)
    filepath = ln.dev.datasets.schmidt22_crispra_gws_IFNG(ln.settings.storage)
    file = ln.File(filepath, description="Raw data of schmidt22 crispra GWS")
    file.save()

upload and analyze the GWS data

    ln.setup.login("testuser2")
    transform = ln.Transform(name="GWS CRISPRa analysis", type="notebook")
    ln.track(transform)
    file_wgs = ln.File.filter(key="schmidt22-crispra-gws-IFNG.csv").one()
    df = file_wgs.load().set_index("id")
    hits_df = df[df["pos|fdr"] < 0.01].copy()
    file_hits = ln.File(hits_df, description="hits from schmidt22 crispra GWS")
    file_hits.save()
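
For anyone who wants to check the hit-filtering step without a LaminDB instance, here is a minimal pandas-only sketch. Only the `id` and `pos|fdr` column names and the `0.01` threshold come from the snippet above; the gene names and FDR values are invented for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the loaded screen results.
# Only the column names "id" and "pos|fdr" come from the snippet above;
# the rows are made up.
df = pd.DataFrame(
    {
        "id": ["GENE_A", "GENE_B", "GENE_C", "GENE_D"],
        "pos|fdr": [0.001, 0.2, 0.0099, 0.01],
    }
).set_index("id")

# Keep rows whose "pos|fdr" value is strictly below 0.01.
# .copy() gives an independent frame rather than a view on df.
hits_df = df[df["pos|fdr"] < 0.01].copy()

print(hits_df.index.tolist())
```

Note that the comparison is strict, so a row with `pos|fdr` exactly equal to `0.01` (here `GENE_D`) is not counted as a hit.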

- [x] "Let’s see how the data lineage of this looks:" -> Let's see what the data lineage of this looks like
- [x] Add a new line above scanpy & move it 1 line down because it concerns the next section in the hidden code cell

    screen_hits = file_hits.load()
    import scanpy as sc


- [ ] Can we show the plot? Would just be fun to look at after there have been so many steps in the data pipeline
![image](https://github.com/laminlabs/lamin-usecases/assets/21954664/ca12be47-d59a-4ae2-bca4-7f259ddc633c)

falexwolf commented 1 year ago

One comment on this:

image

I generally agree! 0.5 sentences on what data lineage is would be awesome. If everything stays at 2-3 sentences, we can also eliminate the dropdown!

One remark: Let's please try to use the same sentences/words/definition that we use in the tutorial.

The idea is that most people will read the readme/landing & tutorial and then come here when they want to know more about data lineage. These people will already have a rough idea of data lineage, and they'll also have seen LaminDB's take on it in the tutorial and on the readme/landing.

A few people might come here without first reading the tutorial. For them, the 0.5 sentences on what data lineage is are gold.

But in any case, language should be ultra-consistent between different sections of the docs.

Soonish, I'd also suggest adding both data lineage and provenance to our glossary.
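
To make the "0.5 sentences" concrete, one common reading of data lineage is a record of which transform, run by which user, produced which file from which inputs. Here is a rough plain-Python sketch of that idea for the two steps discussed in this issue. It is not LaminDB's API or the tutorial's wording: the `Run` class, the `upstream` helper, and the `"hits"` label are hypothetical names for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    """Hypothetical record of one tracked step (not a LaminDB class)."""

    transform: str                                # name of the app/notebook
    user: str                                     # who ran it
    inputs: list = field(default_factory=list)    # files read
    outputs: list = field(default_factory=list)   # files written


# The two steps from the review above, written down as plain records.
runs = [
    Run(
        transform="Upload GWS CRISPRa result",
        user="testuser1",
        outputs=["schmidt22-crispra-gws-IFNG.csv"],
    ),
    Run(
        transform="GWS CRISPRa analysis",
        user="testuser2",
        inputs=["schmidt22-crispra-gws-IFNG.csv"],
        outputs=["hits"],
    ),
]


def upstream(file_name, runs):
    """Walk backwards from a file and list the transforms that led to it.

    Assumes the run graph has no cycles.
    """
    lineage = []
    frontier = [file_name]
    while frontier:
        current = frontier.pop()
        for run in runs:
            if current in run.outputs:
                lineage.append(run.transform)
                frontier.extend(run.inputs)
    return lineage


print(upstream("hits", runs))
```

Walking back from the hits file yields the analysis notebook first and the upload app second, which is the chain the lineage plot is meant to visualize.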

falexwolf commented 1 year ago

Really 100% agreed on everything else! @sunnyosun & I are now trying to come up with a video that actually walks through this using LaminDB & LaminApp.

falexwolf commented 1 year ago

If you prettify anything, please start with this one!