Open jkobject opened 4 months ago
Name | Link |
---|---|
Latest commit | fdd2c4ce445e1631b53940b9bb5f2d6a79aef8ad |
Latest deploy log | https://app.netlify.com/sites/bionty-base-091d/deploys/65d4bd1442ee3500098d418e |
Deploy Preview | https://deploy-preview-540--bionty-base-091d.netlify.app |
Preview on mobile | Toggle QR Code...Use your smartphone camera to open QR code link. |
To edit notification comments on pull requests, go to your Netlify site configuration.
Thank you for this PR @jkobject ! Just that it won't have an effect yet for users who want customized parsing, since we pull the reference parquet files from the s3.
To enable user to define their own relationship types, we need to either enable a local mode for the reference files (the risk is that users won't be able to get new references automatically when it's updated); or we figure something out specifically for lamindb.
@Zethson @falexwolf how do you think we should go about this?
Related issue: https://github.com/laminlabs/bionty-base/issues/529
I am not sure I understand what you mean but on my end doing: bt.Tissue.public().pronto().to_df(include_rel ="part_of")
seems to work
I am not sure I understand what you mean but on my end doing: bt.Tissue.public().pronto().to_df(include_rel ="part_of") seems to work
Yes, this works, but it won't work for canonical accessors such as .from_public()
or .from_values()
or public.df()
.
Does accessing the df
with pronto().to_df(...)
alone already meet what you need?
I might be able to rewrite my code but right now it is true that I am using bt.Tissue.filter().df(include=["parents__ontology_id"])
and it wouldn't work for this..
@jkobject thank you for the PR and sorry for chipping in so late. I think I accidentally deleted the Github notification email.
So we could of course upload monstrous .parquet
files to our S3 that include all relationships, but this will probably greatly increase the download and parsing times. Maybe there'd be a way to provide partial reading support for a subset of columns of the parquet
file then if we store things appropriately.
I'd probably suggest a solution that involves lamindb?
makes it much slower if one chooses something like "part_of" for certain kinds of ontologies (90mns for "part_of" on all tissues)