mskilab-org / chromunity

Discovery of communities in Pore-C concatemers

could not find function "chromunity" #10

Open farhan-lab opened 1 year ago

farhan-lab commented 1 year ago

Hi Chromunity developer,

Thanks for this excellent tool. I am learning chromunity and doing a test run of calling "Sliding window" chromunity on the demo GRanges, following the tutorial. I was able to install chromunity and all required packages as described in the installation instructions. When I reach this step:

this_sliding_chrom = chromunity(concatemers = this_gr_training, resolution = 5e4, window.size = 2e6, mc.cores = 1)

I get this error:

Error in chromunity(concatemers = this_gr_training, resolution = 50000, : could not find function "chromunity"

Could you please tell me what I am doing wrong?

Thank you, Kind regards, Farhan

jameson-orvis commented 1 year ago

Hello! The tutorial was a bit outdated. In the current version of Chromunity the chromunity() function no longer exists; it has been replaced by two functions, sliding_window_chromunity() and re_chromunity(). The tutorial should now reflect these changes. Sorry about that, and don't hesitate to reach out if you run into any additional issues!
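
A quick way to confirm which of these entry points an installed version actually exposes is to inspect the package's exports. The sketch below uses only base R; `has_export` is a hypothetical helper for illustration, not part of Chromunity.

```r
# Hypothetical helper (not part of chromunity): TRUE if `fn` is exported by `pkg`.
# Returns FALSE when the package is not installed.
has_export <- function(pkg, fn) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(FALSE)
  }
  fn %in% getNamespaceExports(pkg)
}

# Check the old name and the two replacements named above.
for (fn in c("chromunity", "sliding_window_chromunity", "re_chromunity")) {
  cat(sprintf("%-28s exported: %s\n", fn, has_export("chromunity", fn)))
}
```

If the old name reports FALSE while the two new names report TRUE, the installed version is the updated one and calls in older scripts need to be renamed.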

farhan-lab commented 1 year ago

Hi,

Thanks for your explanation. I have noted the updated functions and will try them on the test data. Could you please provide step-by-step guidelines or a tutorial for the updated "Sliding window" Chromunity?

In addition, I would like to request your thoughts on two issues for running chromunity on my datasets:

  1. When I convert my parquet files into GRanges, I get the following error:

Error in rbindlist(parq.list, fill = TRUE) : Item 1 of input is not a data.frame, data.table or list
In addition: Warning message: In mclapply(X, function(...) { : all scheduled cores encountered errors in user code

  2. I have a specific question related to my project. We have generated capture Pore-C data to study interactions of a locus of interest, which is around 800 kb. Do you think the "Sliding window" approach will work for calling high-order interactions, considering that we are not looking at genome-wide or chromosome-wide interactions? If yes, could you please recommend custom settings for locus-specific resolution of high-order interactions among enhancers and promoters in this locus?

Many thanks, Kind regards, Farhan

jameson-orvis commented 1 year ago

The updated tutorial should be live now at http://mskilab.com/chromunity/tutorial.html

  1. From this error I am not sure. Could you try this snippet on testing data?

example_dir1 = system.file("extdata/", package = 'chromunity')
this.gr = parquet2gr(example_dir1, mc.cores = 2)

If this runs and your data still doesn't load there may be some formatting issue with your parquet file?

  2. 800 kb is smaller than a single default sliding window of 2e6 bp, so this should be runnable within a single window. The sliding_window_chromunity function takes a "windows" argument that lets you specify a particular window to run Chromunity on, so you should be able to simply pass the 800 kb window you are analyzing as a GRanges.
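
As a rough sketch of that suggestion, the locus can be built as a one-range GRanges and handed to the "windows" argument. The coordinates are placeholders, `this.gr` stands for the concatemers loaded earlier, and the `concatemers`, `resolution` and `mc.cores` arguments are assumed to carry over from the old chromunity() call; check ?sliding_window_chromunity for the actual signature. The resolution value is only illustrative.

```r
library(GenomicRanges)

# Placeholder coordinates for an 800 kb capture locus (not from this thread).
locus <- GRanges(
  seqnames = "chr8",
  ranges   = IRanges(start = 127000001, end = 127800000)
)
stopifnot(width(locus) == 8e5)

# Assumed call shape; only runs when chromunity is installed and the
# concatemers object `this.gr` already exists in the session.
if (requireNamespace("chromunity", quietly = TRUE) && exists("this.gr")) {
  locus_chrom <- chromunity::sliding_window_chromunity(
    concatemers = this.gr,
    resolution  = 1e4,    # illustrative bin size, not a recommendation
    windows     = locus,
    mc.cores    = 1
  )
}
```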

farhan-lab commented 1 year ago

Thank you for the updated tutorial and insights.

I tried the snippet you provided on the testing data. It ran with some warnings, but it did convert the parquet files to GRanges.

However, when I tried the same step from your updated tutorial, it gave an error:

Error in parquet2gr(example_dir1, mc.cores = 2) : No valid files files with suffix pore_c.parquet found.

I got the same error when I tried to convert my own parquet files to GRanges. I then renamed all my parquet chunks to end with the suffix pore_c.parquet, and got this error instead:

Error in rbindlist(parq.list, fill = TRUE) : Item 1 of input is not a data.frame, data.table or list
In addition: Warning message: In mclapply(X, function(...) { : all scheduled cores encountered errors in user code

Thanks

jameson-orvis commented 1 year ago

Hmm, the error is likely happening in the call to read_parquet() inside the pbmclapply loop. I am assuming it is attempting to read a filename that doesn't exist, or something like that. I would try stepping through the function line by line: run debug(parquet2gr), then run parquet2gr on your data, and check that the paths stored in all.paths do correspond to your data. Otherwise, maybe it is a deeper file formatting issue?
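
One way to narrow this down without stepping through the debugger is to list the chunk files and try reading each one serially, so the real error is not hidden behind the parallel wrapper (when every worker fails, the results are error objects rather than tables, which is what rbindlist complains about). This sketch assumes the chunks can be read with arrow::read_parquet(); `find_chunks`, `try_read` and the directory path are hypothetical.

```r
# Hypothetical helper: full paths of files ending in the expected suffix.
find_chunks <- function(dir, suffix = "pore_c.parquet") {
  all_files <- list.files(dir, full.names = TRUE)
  all_files[endsWith(all_files, suffix)]
}

# Hypothetical helper: read one chunk serially, return "ok" or the error text.
try_read <- function(path) {
  tryCatch(
    {
      arrow::read_parquet(path)
      "ok"
    },
    error = function(e) conditionMessage(e)
  )
}

parquet_dir <- "path/to/parquet_chunks"  # placeholder, replace with your own
chunks <- find_chunks(parquet_dir)
cat(length(chunks), "chunk(s) found\n")

if (length(chunks) > 0 && requireNamespace("arrow", quietly = TRUE)) {
  status <- vapply(chunks, try_read, character(1))
  print(status[status != "ok"])  # files that fail, with their error messages
}
```

If zero chunks are found, the suffix or directory is the problem; if chunks are found but fail to read, the printed messages point at the file contents instead.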

farhan-lab commented 1 year ago

Thank you so much. Yes, I have shared my data at this link. The folder contains multiple parquet chunks generated by the Pore-C Snakemake pipeline. Best regards, Farhan

jameson-orvis commented 11 months ago

Hi, sorry for taking a while to reply. I believe the error comes from the fact that your parquet files include non-standard header names: you seem to have two separate alignments, with "align1" and "align2" prefixed to each respective column header. To load your data, you should call the parquet2gr() function with the argument col_names = c('align1_start', 'align1_end', 'align1_chrom', 'read_name'), plus any other fields from your parquet files you are interested in. Furthermore, before line 81 in the chromunity.R file, you should also insert the following line:

colnames(parq.dt) = gsub("align1_", "", colnames(parq.dt))

This will strip the "align1_" prefix from your data table headers and ensure you can convert the data to a GRanges.
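
The effect of that line can be seen on a toy table. The column names below mimic the ones described above and are assumptions about the file layout, not taken from the actual data. Anchoring the pattern with `^` is a small optional tightening so that only a leading prefix is removed.

```r
# Toy stand-in for one parquet chunk (made-up values).
parq.dt <- data.frame(
  align1_chrom = c("chr8", "chr8"),
  align1_start = c(100L, 500L),
  align1_end   = c(200L, 650L),
  read_name    = c("read_a", "read_a"),
  align2_chrom = c("chr8", "chr8")
)

# Strip the leading "align1_" prefix; "align2_" columns are left untouched.
colnames(parq.dt) <- gsub("^align1_", "", colnames(parq.dt))
print(colnames(parq.dt))
```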

I hope this helps!

farhan-lab commented 11 months ago

Thanks for your response. I will follow your recommended settings and further update in my next comment.

farhan-lab commented 11 months ago

Hi Again,

Sorry for my late update on this issue. I am trying your solution to convert my parquet data to GRanges. As you suggested, I should insert a line in the chromunity.R file, but I am not able to locate this file in my installed library folder.

[screenshot: chromunity_screenshot]

Could you suggest how I can make this edit?

Best, Farhan