PacificBiosciences / pbbioconda

PacBio Secondary Analysis Tools on Bioconda. Contains list of PacBio packages available via conda.
BSD 3-Clause Clear License

IsoPhase on MAS-Iso-seq data for haplotyping #690

Closed: camelest closed this issue 1 month ago

camelest commented 2 months ago

Hi, I'm wondering whether we could apply IsoPhase (https://github.com/Magdoll/cDNA_Cupcake/wiki/IsoPhase:-Haplotyping-using-Iso-Seq-data) or any other haplotyping method to MAS-Iso-seq data. We would like to assign a maternal or paternal origin to each mapped read in the BAM file.

I'm following https://isoseq.how for preprocessing. For the inputs required by IsoPhase,

  1. Reference genome, in fasta format
  2. Full-length (CCS/HiFi) reads, in fastq format (best if filtered for accuracy!, see below)
  3. Annotation GFF file using Iso-Seq data, where each isoform has ID format PB.X.Y
  4. Association file linking individual full-length reads to isoforms PB.X.Y

could we use

  1. dedup.fasta (output from isoseq groupdedup)
  2. collapsed.sorted.gff (output from isoseq collapse)
  3. annotated.info.csv (output from pigeon make-seurat)?

Also, is there any modification needed to run IsoPhase on MAS-Iso-seq data?
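
For the annotation input, the PB.X.Y ID requirement can at least be checked up front. Below is a minimal sketch of such a check, not part of IsoPhase or the isoseq tools. It assumes the GFF stores each isoform ID as a `transcript_id "..."` attribute, which should be verified against the actual file; the helper names `isoform_ids` and `malformed_ids` are hypothetical.

```python
import re

# Isoform IDs are expected to look like PB.X.Y (e.g. PB.12.3),
# as stated in the IsoPhase input list.
PB_ID = re.compile(r"^PB\.\d+\.\d+$")

# Assumption: the GFF keeps the isoform ID as transcript_id "..."
# in the attribute column. Adjust the pattern if the file differs.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def isoform_ids(gff_lines):
    """Collect the transcript_id values found in GFF text lines."""
    ids = set()
    for line in gff_lines:
        if line.startswith("#"):
            continue  # skip header and comment lines
        match = TRANSCRIPT_ID.search(line)
        if match:
            ids.add(match.group(1))
    return ids


def malformed_ids(ids):
    """Return the IDs that do not follow the PB.X.Y pattern, sorted."""
    return sorted(i for i in ids if not PB_ID.match(i))
```

Passing an open file handle for the GFF to `isoform_ids` and the result to `malformed_ids` should give an empty list if every isoform ID already has the expected form.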

Although we are aware that IsoPhase is no longer officially supported, we would really appreciate it if we could get some help.

Best,

Raku

armintoepfer commented 1 month ago

This is not a mechanical software problem. Please ask support@pacb.com.

Magdoll commented 1 month ago

Hi @camelest, IsoPhase is no longer supported. There are a few newer tools that might serve you better:

IsoLaser: https://www.biorxiv.org/content/10.1101/2024.06.14.599101v1

LongCallR (might only have variant calling, no phasing): https://github.com/huangnengCSU/longcallR

Good luck! -Liz