Open alexyfyf opened 11 months ago
Hi, thank you for reporting this error. I have pushed a new release that should fix the FASTA format output. The idea behind the header for each isoform is as follows: the first number (in your case '0') denotes the cluster the isoform was generated from. The second number (in your case '105') gives the batch number within the cluster (we divide each cluster into batches of 1000 reads each), while the third number is an individual id, so that no two isoforms end up with the same id. I will address the problem with the empty intermediate files in the next few days. Best, Alex
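For anyone who wants to split these headers programmatically, here is a minimal sketch of the three-number scheme described above. The thread does not show the exact header layout, so the underscore separator, the sample header `>0_105_7` (the third number is made up), and the names `IsoformId` and `parse_isoform_header` are all assumptions for illustration, not part of isONform itself. Adjust `sep` to whatever your output actually uses.

```python
from typing import NamedTuple


class IsoformId(NamedTuple):
    """Hypothetical container for the three numbers in an isoform header."""
    cluster: int  # cluster the isoform was generated from
    batch: int    # batch number within that cluster
    serial: int   # individual id that keeps headers unique


def parse_isoform_header(header: str, sep: str = "_") -> IsoformId:
    """Split a header such as '>0_105_7' into its three numbers.

    The separator is an assumption; the thread only states that the
    header carries a cluster number, a batch number, and an individual id.
    """
    text = header.strip().lstrip(">").strip()
    if not text:
        raise ValueError("empty header")
    token = text.split()[0]  # ignore anything after the first whitespace
    parts = token.split(sep)
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"unexpected header layout: {header!r}")
    cluster, batch, serial = (int(p) for p in parts)
    return IsoformId(cluster, batch, serial)


if __name__ == "__main__":
    # '0' and '105' come from the reply above; '7' is a made-up individual id.
    print(parse_isoform_header(">0_105_7"))
```

The function raises `ValueError` on anything that does not have exactly three numeric fields, so a mismatch with the real header layout shows up immediately instead of being parsed silently.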
Hi Alex,
Thank you for your reply. So my understanding is that your transcript identifiers are derived from the gene clusters produced by isONclust, so the cluster id, i.e. the first number, could be used as a gene id surrogate? Am I correct?
Thank you. Alex
Hi Alex, the clusters generated by isONclust represent gene families rather than individual genes, so it would be risky to use them as gene surrogates. Best, Alex
Hi team,
I found that your isONform output FASTA file is not in the standard format, with a > line as the header. There are also lots of empty files in the isonform folder, such as
Also, can you explain what the numbers in the header line mean, for example this one
Thank you so much.
Alex