mims-harvard / TDC

Therapeutics Commons (TDC-2): Multimodal Foundation for Therapeutic Science
https://tdcommons.ai
MIT License

SingleCellPrediction should instead be labeled Perturb-based prediction task for scperturb datasets #237

Closed by amva13 6 months ago

amva13 commented 7 months ago

SingleCellPrediction should instead be labeled as a perturbation-based prediction task for scperturb datasets.

abearab commented 7 months ago

scperturb contains datasets with very different types of perturbation, e.g. genetic perturbation (knock-out, inhibition, or activation at the DNA or RNA level, using diverse platforms such as Cas9, Cas12, or Cas13) and drug perturbation. Perturbations can also be higher-order, e.g. inhibition of two genes at the same time. So, in my opinion, the "prediction" tasks can go in many directions.
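To make the point concrete, here is a minimal sketch of how these perturbation types could be told apart when defining tasks. Everything in it is hypothetical and for illustration only: the `Perturbation` class, the `Kind` enum, the `group_by_task` helper, and the placeholder gene and compound names are not part of TDC or scperturb.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """Broad perturbation families mentioned above (hypothetical labels)."""
    GENETIC = "genetic"
    DRUG = "drug"


@dataclass(frozen=True)
class Perturbation:
    """Hypothetical record describing one perturbation condition."""
    kind: Kind
    targets: tuple  # placeholder gene symbols or compound names
    mode: str = ""  # e.g. "knock-out", "inhibition", "activation"; empty for drugs

    @property
    def order(self) -> int:
        # Order 1 = single perturbation, order 2 = two targets at once, etc.
        return len(self.targets)


def group_by_task(perturbations):
    """Group perturbations by (kind, order), one candidate task per group."""
    groups = defaultdict(list)
    for p in perturbations:
        groups[(p.kind.value, p.order)].append(p)
    return dict(groups)


# Placeholder conditions, not taken from any real dataset.
examples = [
    Perturbation(Kind.GENETIC, ("GENE_A",), "knock-out"),
    Perturbation(Kind.GENETIC, ("GENE_A", "GENE_B"), "inhibition"),
    Perturbation(Kind.DRUG, ("COMPOUND_X",)),
]
task_groups = group_by_task(examples)
```

Each key of `task_groups`, such as `("genetic", 2)`, would then correspond to a distinct prediction setting, which is one way the tasks could branch.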

amva13 commented 7 months ago

It is with this understanding that the loader is kept at this level of detail rather than being segmented into single-instance, multi-instance, etc.

This decision was made in consultation with fellow ML researchers at our lab.

Feel free to suggest a better API as a feature request, or create a pull request.

abearab commented 7 months ago

> It is with this understanding that the loader is kept at this level of detail rather than being segmented into single-instance, multi-instance, etc. This decision was made in consultation with fellow ML researchers at our lab.

Cool. As a user, I strongly agree that ML tasks with perturbation datasets could be seen as a new "problem" rather than being part of current problem definitions.

> Feel free to suggest a better API as a feature request, or create a pull request.

Will do, thanks.

amva13 commented 6 months ago

Closed with https://github.com/mims-harvard/TDC/pull/252. Thanks, @kexinhuang12345!