Open CWYuan08 opened 1 year ago
Hi @CWYuan08,
Since I was the one who referred you here: this seems to be the way to run isONclust2: https://github.com/epi2me-labs/wf-transcriptomes (and this section in particular: https://github.com/epi2me-labs/wf-isoforms#de-novo-based-approach-experimental)
You can then run isONcorrect on the clustered output, and isONform for consensus. I have not tried the approach they listed here, but they say it is experimental, which typically means no substantial benchmarks have been done.
Best, K
Hi @ksahlin, but the de-novo-based approach (experimental) cannot be run in command-line mode. You can see here. So maybe the isONclust-isONcorrect-isONform pipeline is the only way?
(Thanks @ksahlin for adding some comments here).
@Johnsonzcode the de-novo based approaches are indeed still largely experimental and so the code is not well-maintained. This project is not currently maintained and there is no one at Oxford Nanopore Technologies currently studying de-novo approaches. I dare say that @ksahlin is far more of an expert in the space than we are.
So how could I get non-redundant isoforms from ONT full-length transcripts? Is there a suggested pipeline?
But how could I solve the error as mentioned? Or is there some pipeline suggested to get non-redundant isoform from ONT full-length transcriptome sequencing?
Or is there some pipeline suggested to get non-redundant isoform from ONT full-length transcriptome sequencing ?
I can suggest running pychopper-isONclust-isONcorrect-isONform for this. The problem is that isONclust does not scale to very large datasets. This is what @CWYuan08 noticed and, hence, we ended up here looking for isONclust2 to replace isONclust as a solution. Another way is to manually batch (i.e. split) your large dataset into independent batches, each small enough for a separate isONclust run.
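The manual batching idea could be sketched roughly as below. This is only an illustration, not something from the isONclust documentation: `split_fastq` is a hypothetical helper name, the batch size and file names are placeholders, and it assumes strict 4-line FASTQ records (no wrapped sequence lines).

```shell
#!/usr/bin/env bash

# split_fastq is a hypothetical helper, not part of isONclust.
# Assumption: strict 4-line FASTQ records (no wrapped sequence lines).
split_fastq() {
    local input="$1" reads_per_batch="$2" outdir="$3"
    mkdir -p "$outdir"
    # 4 lines per read, so every chunk contains only whole records.
    split -l "$(( reads_per_batch * 4 ))" "$input" "$outdir/batch_"
    # split names chunks batch_aa, batch_ab, ...; add a .fastq extension.
    local chunk
    for chunk in "$outdir"/batch_*; do
        case "$chunk" in
            *.fastq) ;;                        # already renamed on an earlier run
            *) mv "$chunk" "$chunk.fastq" ;;
        esac
    done
}

# Example call (file names and batch size are placeholders):
# split_fastq reads.fastq 100000 batches
```

Each resulting file would then be given to its own isONclust run. Note that batches clustered this way know nothing about each other, so reads from the same gene that land in different batches will presumably end up in separate clusters.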
This pipeline may work.
@Johnsonzcode
The https://github.com/epi2me-labs/wf-isoforms pipeline is deprecated and its functionality has been folded into wf-transcriptomes. If you wish to use the de-novo route through wf-transcriptomes, let's work to uncover the bug you are seeing with its use on the issue you have already started over there. I feel we've gone a bit off topic from @CWYuan08's original post here.
Dear @Johnsonzcode, @cjw85, @ksahlin, thank you all for the useful discussions here, this is what I would like to ask and follow too! Best, CW
Hi, I am trying to run isONclust2 first for isONcorrect, but I got this error for all the batches, one example:

    Loaded input batch from batches/isONbatch_9.cer:
        Batch number: 9
        Batch range: [244492,273799]
        Depth: -1
        Nr sequences: 29308
        Nr bases: 50001212
        Nr clusters: 29308
        Nr nontrivial clusters: 0
        Minimizers in database: 0
    Created pseudo-batch for single clustering:
        Batch number: -9
        Batch range: [244492,273799]
        Depth: -1
        Nr sequences: 29308
        Nr bases: 0
        Nr clusters: 29308
        Nr nontrivial clusters: 0
        Minimizers in database: 0
    Resetting input clusters.
    Clustering mode: Invalid clustering mode: 3
from running:

    for f in batches/isONbatch_*.cer; do
        filename=$(basename "$f")
        output="clustered/${filename%.*}.cer"
        isONclust2 cluster -v -l "$f" -o "$output"
    done
Could you please advise what I need to fix?
Many thanks!!
Best, CW