LabTranslationalArchitectomics / riboWaltz

optimization of ribosome P-site positioning in ribosome profiling data
MIT License

Short read inputs and questions with bamtolist & codon_usage_psite #81

Closed chenghongdeng closed 2 months ago

chenghongdeng commented 6 months ago

Hi Fabio,

First, thanks for developing this fantastic tool. I have used it to analyze several datasets and got great results.

However, when I try to use riboWaltz to analyze a subset of my data, I run into some problems. This subset contains reads that are shorter than normal RPFs (less than 25 nt). I use the same scripts that work for my whole dataset, and I got a couple of errors.

For the bamtolist function, my code is:

reads_list <- bamtolist(bamfolder = "RiboWaltz/bamfiles", annotation = annotation_dt)

I got the error:

Error in (function (x) : attempt to apply non-function

But after this error, it is still able to read the BAM files and proceed to the later functions. After reading all my BAM files, I got a warning message:

call dbDisconnect() when finished working with a connection

Then I calculate the P-site offset using the psite function. I also run into the same problem as in #79: the P-site is off.

And when I try to run codon_usage_psite, my command is:

example_cu_barplot <- codon_usage_psite(reads_psite_list, annotation_dt,
                                        sample = input_samples,
                                        multisamples = "average",
                                        plot_style = "facet",
                                        fastapath = "hg38.RefSeq.reduced.mRNA.bed12.cds.fa",
                                        fasta_genome = FALSE,
                                        frequency_normalization = FALSE)

The error message I got:

Error in h(simpleError(msg, call)) : error in evaluating the argument 'x' in selecting a method for function 'subseq': subscript contains invalid names
Calls: codon_usage_psite ... normalizeSingleBracketSubscript -> NSBS -> NSBS -> .subscripterror
Execution halted

I would appreciate any help debugging this error.

Thank you Chenghong

fabiolauria commented 6 months ago

Hi Chenghong. First of all, do I understand correctly that all these issues arise only when you use the subset with reads shorter than 25 nt? In any case, I wouldn't focus too much on errors and warnings that still yield the expected output, especially because I see no reason why they should pop up in this context and not in others. Plus, the warning doesn't seem to be related to the package itself.

As for the last error, if you managed to get the P-site by overcoming the #79-like issue, I think it's something similar to other issues about inconsistent transcript names across GTF and FASTA. Can you try to follow some of the steps suggested in #72 and let me know how it works?

Best Fabio
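
The transcript-name mismatch between annotation and FASTA mentioned above can be checked directly. The following is only a minimal sketch in base R, not the procedure from #72: `check_transcript_names` is a hypothetical helper, and the toy vectors stand in for the real names. With real data, and assuming the annotation table has a `transcript` column, the two inputs could be taken from `annotation_dt$transcript` and `names(Biostrings::readDNAStringSet(fastapath))`.

```r
# Hypothetical helper (not part of riboWaltz): report transcript IDs that are
# present in the annotation but have no matching FASTA record name.
check_transcript_names <- function(annotation_ids, fasta_ids) {
  # Many FASTA headers carry a description after the first whitespace;
  # keep only the first token for a second, more lenient comparison.
  fasta_first_token <- sub("\\s.*$", "", fasta_ids)
  list(
    missing_exact = setdiff(annotation_ids, fasta_ids),
    missing_after_trim = setdiff(annotation_ids, fasta_first_token)
  )
}

# Toy data (assumption): stand-ins for the annotation transcript IDs
# and for the record names read from the FASTA file.
annotation_ids <- c("NM_000014.6", "NM_000015.3", "NM_000016.6")
fasta_ids <- c("NM_000014.6 CDS", "NM_000015.3", "NM_000099.1")

res <- check_transcript_names(annotation_ids, fasta_ids)
print(res)
```

If `missing_exact` is non-empty but `missing_after_trim` is empty, the FASTA headers probably just carry extra text after the ID. If both are non-empty, some annotated transcripts have no sequence in the FASTA at all, which could plausibly lead to an "invalid names" subscript error like the one reported.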

chenghongdeng commented 6 months ago

Hi Fabio,

Thanks for your response. Yes, all these issues arise only when I use the subset of data that contains reads shorter than 25 nt.

I will try to follow up with #72.

Thank you Chenghong

fabiolauria commented 6 months ago

Hi there, I'm still thinking about the errors arising only with short reads. Another question for you: what is the minimum length of the reads in your sample?

Thank you Fabio

chenghongdeng commented 6 months ago

Hi Fabio,

The minimum length is 15 nt. My reads range from 15 nt to 25 nt.

Best, Chenghong
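
For anyone wanting to confirm the read-length range of their own sample, a quick check might look like the sketch below. It uses a toy data frame in place of one element of `reads_list`. The `length` column name is an assumption about the table layout, and the values are made up for illustration.

```r
# Toy stand-in (assumption) for one sample's read table;
# the real table would come from reads_list.
reads <- data.frame(
  transcript = c("tx1", "tx1", "tx2", "tx2", "tx2"),
  length = c(15L, 22L, 25L, 18L, 22L)
)

# Minimum and maximum read length in the sample
len_range <- range(reads$length)

# Number of reads observed at each length
len_counts <- table(reads$length)

print(len_range)
print(len_counts)
```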

fabiolauria commented 4 months ago

Hi Chenghong, did you have the chance to try and follow the steps suggested in https://github.com/LabTranslationalArchitectomics/riboWaltz/issues/72 to solve your error?

Is there anything I can do to help?

Best Fabio