markrobinsonuzh / cytofWorkflow

cofactor for arcsinh transformation when a sce object is assembled #18

Closed sunhuaiyu closed 4 years ago

sunhuaiyu commented 4 years ago

Dear Dr. Robinson:

I am writing with a question that came up while using cytofWorkflow to analyze flow cytometry data.

When trying out different cofactor values for prepData(), I was not satisfied with the histograms of some markers (a single peak, too narrow). Can I use a different cofactor for each marker? Do you think this would be a 'safe' approach to data transformation in cytofWorkflow?

Thank you for your attention. Best regards,

Huaiyu Sun

markrobinsonuzh commented 4 years ago

@sunhuaiyu Thanks for the message.

Yes, you can use a different cofactor for each marker. In fact, you can pass a vector of cofactors to prepData(). The docs say:

cofactor numeric cofactor(s) to use for optional arcsinh-transformation when transform = TRUE; single value or a vector with channels as names.

Cheers, Mark
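
To make the per-marker idea concrete, here is a minimal Python sketch of an arcsinh transformation with one cofactor per channel, i.e. asinh(x / cofactor). It is not CATALYST code and does not reproduce how prepData() works internally; the function name, the channel names (CD3, CD19), and the cofactor values are hypothetical and chosen only for illustration.

```python
import math


def arcsinh_transform(cells, cofactors):
    """Return asinh(value / cofactor) for every channel of every cell.

    cells:     list of dicts mapping channel name -> raw intensity
    cofactors: dict mapping channel name -> cofactor for that channel
               (hypothetical stand-in for a vector with channels as names)
    A channel without a cofactor raises KeyError instead of silently
    falling back to a default.
    """
    transformed = []
    for cell in cells:
        transformed.append(
            {ch: math.asinh(value / cofactors[ch]) for ch, value in cell.items()}
        )
    return transformed


# Illustrative data only: two cells, two channels, one cofactor per channel.
cells = [
    {"CD3": 0.0, "CD19": 150.0},
    {"CD3": 5.0, "CD19": 1500.0},
]
cofactors = {"CD3": 5.0, "CD19": 150.0}

transformed = arcsinh_transform(cells, cofactors)
for row in transformed:
    print({ch: round(v, 3) for ch, v in row.items()})
```

A larger cofactor widens the near-linear region around zero for that channel, which is why changing it per marker changes the shape of that marker's histogram without touching the others.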

sunhuaiyu commented 4 years ago

Thank you!

(Email reply to markrobinsonuzh's comment of Sep 3, 2020, 11:51 PM.)

sunhuaiyu commented 4 years ago

Hi Mark,

Thanks again for responding to my question regarding cofactors.

I am trying to process a large dataset of over 200 samples and more than 25 million cells. However, the clustering and dimensionality reduction functions failed and crashed the R session (running on a single EC2 instance with 128 GB of memory). Do you have any recommendations for using the CyTOF workflow on data of this size?

Best regards,

Huaiyu

markrobinsonuzh commented 4 years ago

This is a good question. It's a little more difficult with CyTOF than with scRNA-seq, for example, because CyTOF data have no natural 0, and a natural 0 is what allows sparse matrices to be used. On the flip side, the dimension of CyTOF data should be much smaller.

A couple thoughts:

sunhuaiyu commented 4 years ago

Thank you! I will definitely try downsampling first. I really appreciate the detailed reply.

Huaiyu

(Email reply to markrobinsonuzh's comment of Sep 9, 2020, 4:01 AM.)

sunhuaiyu commented 4 years ago

Hi Mark,

Is there an option to parallelize the t-SNE computation in the CyTOF workflow?

Also, as a separate question, is FlowSOM100 the only clustering scheme available in your workflow?

Thank you, Huaiyu

HelenaLC commented 4 years ago

Re parallelisation of t-SNE (and other dimensionality reduction methods): CATALYST's runDR calls scater's runX functions (e.g., runTSNE for t-SNE). Thus, any parameters accepted by these functions can be supplied via ... in runDR and will be passed on to the corresponding runX method. For t-SNE, parallelisation can be achieved via the argument BPPARAM (see ?scater::runTSNE for details).

Re other clustering algorithms: While cluster wraps around FlowSOM only, any clustering method can, in principle, be applied and incorporated into CATALYST's infrastructure. Without going into details here, please see section 8.2, Using other clustering algorithms, in the vignette: http://bioconductor.org/packages/release/bioc/vignettes/CATALYST/inst/doc/differential.html

sunhuaiyu commented 4 years ago

Thanks!

(Email reply to the comment by HelenaLC (Helena L. Crowell) of Sep 17, 2020, 12:24 AM.)