bartongroup / RATS

Relative Abundance of Transcripts: An R package for the detection of Differential Transcript isoform Usage.
MIT License

Issue with importing salmon files #69

Closed Mekilich closed 2 years ago

Mekilich commented 2 years ago

Hi, I want to carry out some analysis with your package, but I have a problem importing Salmon files. fish4rodents successfully converts the files to H5 format, but then gives the error below.

mydata <- fish4rodents(A_paths= samples_A, B_paths= samples_B, annot= myannot, scaleto=100000000)

Error in lgth$A : $ operator is invalid for atomic vectors

How do I solve this? What could be the reason?

fruce-ki commented 2 years ago

Hello,

The reason is likely that something that was expected to be a 2-dimensional table turned out to be a one-dimensional vector instead, or possibly even empty.

How many replicates do you have per condition, and how many bootstrap rounds does your data have? Is the h5 file size believable, or could it be that it is empty and the conversion was in fact not successful? Do you get the same error if you quantify with Kallisto instead of Salmon?
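For anyone hitting the same message: the base R sketch below reproduces it in isolation. The objects `lgth_list` and `lgth_vec` are made up for illustration and are not the actual RATs internals; the point is only that `$` works on lists and data frames but not on a plain named vector.

```r
# Illustration only: these objects are stand-ins, not the real RATs `lgth`.
lgth_list <- list(A = c(3, 3, 3), B = c(3, 3, 3))  # list-like object: `$` works
lgth_vec  <- c(A = 3, B = 3)                       # named atomic vector: `$` fails

lgth_list$A

# Capture the error instead of stopping the script.
msg <- tryCatch(lgth_vec$A, error = function(e) conditionMessage(e))
print(msg)

# Double-bracket indexing works for both kinds of object.
lgth_vec[["A"]]
lgth_list[["A"]]
```

Running `str()` on the offending object usually shows quickly whether it is a list, a table or a bare vector.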


Mekilich commented 2 years ago

Hi, thanks for answering. I have 3 replicates per condition. The quant.sf files are 2 MB while the H5 files are 500 KB, so I guess the conversion was successful. I haven't tried with Kallisto, but I will, and will let you know if that works. I used the same Salmon files with the IsoformSwitchAnalyzeR package and they worked fine there.

fruce-ki commented 2 years ago

Ok, let me know. I'm sure the Salmon files are fine. I am just trying to isolate whether it is a problem in the conversion to h5 or in the parsing of h5.

Also, you are using the latest version of RATs, right?


Mekilich commented 2 years ago

Hi, I have the Kallisto outputs now, but fish4rodents doesn't work with them since they have no quant.sf files, and I didn't see any other code for importing Kallisto files in the vignette. Am I missing something? I installed RATs yesterday, so it's the latest version. By the way, when I run the code below I get a data frame with one column containing transcript IDs.

ids <- as.data.frame( h5read("col-0_LH_1/abundance.h5", "/aux/ids") )

fruce-ki commented 2 years ago

I don't remember off the top of my head exactly what Kallisto and Salmon output and how RATs dealt with them, as I haven't done transcript quantification in a while. But I may have use for RATs again soon, so I will try to generate some quant files and see if I can reproduce your issue. It could be that Kallisto and Salmon have made changes to their outputs with newer versions.

While I figure this out, if you are in a hurry, maybe you will have better luck with Kallisto 0.44, as that was the one available at the time, I think.

Given that the variable is called ids and knowing my coding style, without the code in front of me at the moment, I would guess that this line does exactly what it is supposed to do. Don't worry about that.


Mekilich commented 2 years ago

Yeah, so I think there is no problem with the h5 files, since that code works. It looks like fish4rodents is looking for a quant.sf file in the Kallisto output, but Kallisto doesn't produce such a file. It produces 3 files: h5, tsv and json. With Salmon, fish4rodents finds the quant.sf files and converts them to h5, but then it gives the error "Error in lgth$A : $ operator is invalid for atomic vectors". Kallisto 0.44 outputs the same files as the newest Kallisto, but the content might be a bit different. fish4rodents gives the error below and deletes the abundance.h5 file from the Kallisto output.

Error in fread(file.path(fish_dir, "quant.sf")) : File 'col-0_LH_1/quant.sf' does not exist or is non-readable. getwd()=='D:/Transcriptomics/Kallisto'

Mekilich commented 2 years ago

I just tried with Kallisto 0.44 output. Same error as before.

fruce-ki commented 2 years ago

Just an update that I have not forgotten about this. Setting up to work with Kallisto for the first time in years was a bumpy road full of annotation problems. But I am getting there and should be able to deal with the actual issue here soon.

Mekilich commented 2 years ago

Thanks for not forgetting. Waiting to see if you will have the same issue.

fruce-ki commented 2 years ago

Right, there is definitely a syntax bug in fish4rodents() that escaped the unit tests.

fruce-ki commented 2 years ago

Ok, I've pushed a syntax fix to the development branch. I suggest you install from there for now, as I need to do some cleanup before making a release.

Please note that I have already disabled wasabi, which is the third-party package that provides conversion from Salmon to Kallisto. Presumably it worked for you, but I got really obscure weird errors out of it just now and it hasn't been available for auto-install for my versions of R in years, so it is negatively impacting the import function. It is a single-line conversion command that users of Salmon and RATs should be able to execute on their own. So don't delete the .h5 files you already got!
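For reference, the manual conversion could look roughly like the sketch below. It assumes the wasabi package is installed and uses its `prepare_fish_for_sleuth()` function; the directory names are hypothetical placeholders for your own Salmon output folders, so adjust them before running.

```r
# Hypothetical Salmon output directories, one per sample, each containing quant.sf.
salmon_dirs <- file.path("salmon_out", c("A_1", "A_2", "A_3", "B_1", "B_2", "B_3"))

if (requireNamespace("wasabi", quietly = TRUE) && all(dir.exists(salmon_dirs))) {
  # Writes an abundance.h5 in the Kallisto layout into each Salmon directory.
  wasabi::prepare_fish_for_sleuth(salmon_dirs)
} else {
  message("wasabi is not installed or the directories are missing; nothing was converted.")
}
```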

Mekilich commented 2 years ago

Hi, I installed RATs from the development branch, but now I get the following error when I run fish4rodents.

Error in rhdf5::h5read(file.path(fil, "abundance.h5"), "/bootstrap") : Object '/bootstrap' does not exist in this HDF5 file.
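A quick way to check whether a given abundance.h5 actually contains bootstraps is to list its top-level objects. The sketch below is only an illustration: `has_bootstrap()` is a made-up helper, the two small data frames are mock listings shaped like the output of `rhdf5::h5ls()`, and the real file is inspected only if rhdf5 is installed and the file exists.

```r
# Hypothetical helper: TRUE if an h5ls()-style listing has a top-level "bootstrap" entry.
has_bootstrap <- function(listing) {
  any(listing$group == "/" & listing$name == "bootstrap")
}

# Mock listings (not real data) mimicking the `group` and `name` columns of h5ls().
with_bs <- data.frame(
  group = c("/", "/", "/", "/bootstrap"),
  name  = c("aux", "bootstrap", "est_counts", "bs0"),
  stringsAsFactors = FALSE
)
without_bs <- data.frame(
  group = c("/", "/"),
  name  = c("aux", "est_counts"),
  stringsAsFactors = FALSE
)

print(has_bootstrap(with_bs))
print(has_bootstrap(without_bs))

# Inspect a real file only when rhdf5 and the file are available.
h5_path <- file.path("col-0_LH_1", "abundance.h5")
if (requireNamespace("rhdf5", quietly = TRUE) && file.exists(h5_path)) {
  print(has_bootstrap(rhdf5::h5ls(h5_path, recursive = FALSE)))
}
```

If the check comes back FALSE, the quantification was most likely run without bootstraps (`--numBootstraps` in Salmon, `-b` in Kallisto).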

fruce-ki commented 2 years ago

Hi,

Did you run Salmon with bootstrapping enabled?

I think I never ran Salmon or Kallisto without bootstraps, but it also means I never tried to import such datasets, so there is no fallback for bootstraps not being available. I should probably implement that.

If you do not have bootstraps, perhaps the best thing to do for now is to try the data.table input method instead of fish4rodents. Salmon output is some form of a flat text table if I remember correctly, so you should be able to import the data into R data.frames or data.tables, format them to the specifications in the vignette and use those as input instead.

You will not be able to use the RATs bootstraps without providing Salmon bootstraps. But hopefully the rest will work.
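A rough sketch of that manual route, in base R, could look like this. It assumes the usual quant.sf columns (Name, Length, EffectiveLength, TPM, NumReads) and that the expected layout is one table per condition with a `target_id` column followed by one column per replicate; please verify the layout against the vignette. `read_condition()` and the tiny demo files are made up for illustration.

```r
# Hypothetical helper: build one table per condition from Salmon quant.sf files.
read_condition <- function(sample_dirs, value_col = "NumReads") {
  tabs <- lapply(sample_dirs, function(d) {
    q <- read.delim(file.path(d, "quant.sf"), stringsAsFactors = FALSE)
    q[, c("Name", value_col)]
  })
  # All replicates must list the same transcripts in the same order.
  ids <- tabs[[1]]$Name
  stopifnot(all(vapply(tabs, function(q) identical(q$Name, ids), logical(1))))
  out <- data.frame(target_id = ids, stringsAsFactors = FALSE)
  for (i in seq_along(tabs)) {
    out[[paste0("rep", i)]] <- tabs[[i]][[value_col]]
  }
  out
}

# Demo with two tiny fake quant.sf files written to a temporary directory.
demo_root <- tempfile("salmon_demo_")
sample_dirs <- file.path(demo_root, c("A_1", "A_2"))
for (d in sample_dirs) {
  dir.create(d, recursive = TRUE)
  fake <- data.frame(Name = c("tx1", "tx2"), Length = c(1000, 1500),
                     EffectiveLength = c(800, 1300), TPM = c(10, 20),
                     NumReads = c(100, 300))
  write.table(fake, file.path(d, "quant.sf"), sep = "\t",
              quote = FALSE, row.names = FALSE)
}

counts_A <- read_condition(sample_dirs)
print(counts_A)
```

The resulting data frames can be converted with `data.table::as.data.table()` and passed as the per-condition abundance tables (from memory these are the `count_data_A` and `count_data_B` arguments of `call_DTU()`, but check the vignette for the exact names).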


Mekilich commented 2 years ago

I have run Kallisto with bootstraps enabled and now it worked, so all looks fine for now. Thanks for all the help.

fruce-ki commented 2 years ago

Good to know that this works! Thank you for your time and understanding in working through this.

fruce-ki commented 2 years ago

As of commit https://github.com/bartongroup/RATS/commit/e27c7393aa9369709fefeae78349d8552eac2024 , it is now possible to import non-bootstrapped estimates from the Kallisto format.

Currently still in the development branch.