Open ShuyangXu opened 3 months ago
The previous question was about what to do when a GTF file has gene_id but no gene_name,
for instance NCBI's GTF files, where the missing gene_name tag makes it impossible to annotate genes.
We have changed this so that gene_id is used as a substitute when gene_name is missing.
This does not mean that it will generate a three-column file similar to Cell Ranger's `feature.tsv.gz`.
Currently, our software does not support generating a three-column `feature.tsv.gz`.
To read the matrix, please refer to the instructions at the bottom of the quick start guide.
Regarding the second question, to analyze RNA data directly, you can use the command `dnbc4tools rna run`. If you need to analyze only a specific step, you can use the `--process` flag.
_Originally posted by @lishuangshuang0616 in https://github.com/MGI-tech-bioinformatics/DNBelab_C_Series_HT_scRNA-analysis-software/issues/8#issuecomment-1250766251_
Hi, MGI dev group,
I noticed you mentioned that the new version can deal with gene_id, yet there is no option to do so (https://github.com/MGI-tech-bioinformatics/DNBelab_C_Series_HT_scRNA-analysis-software/blob/version2.0/doc/scRNA_para.md).
After testing, `features.tsv.gz` outputs 'gene_id' only when all 'gene_name' attributes are deleted in the GTF. Could you add an option to control the gene id/name output mode, or just output them both, like `cellranger` does?

`dnbc4tools rna mkref` creates a `ref.json` file with the key `chrmt`, while the value is set to `dnbc4tools rna data --chrMT`. But the help messages of the three below are duplicated. I can't tell how this value would affect the result.
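
To make the gene id/name request above concrete, here is a minimal Python sketch of the behavior I have in mind. It is not dnbc4tools code: `parse_attributes` and `feature_rows` are hypothetical helper names, and the third column value `Gene Expression` is assumed from the Cell Ranger feature table layout (id, name, feature type). Each gene gets one row, and `gene_id` is used as the name only when no `gene_name` is found for that gene.

```python
import re

# Matches GTF attribute pairs such as: gene_id "G1";
_ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_attributes(field):
    """Return the GTF attribute column (column 9) as a dict of key -> value."""
    return dict(_ATTR_RE.findall(field))


def feature_rows(gtf_lines, feature_type="Gene Expression"):
    """Build (gene_id, gene_name, feature_type) rows, one per gene, in file order.

    Hypothetical helper: gene_name falls back to gene_id when the GTF
    never provides a gene_name for that gene.
    """
    names = {}  # gene_id -> gene_name or None; dicts keep insertion order
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete GTF record
        attrs = parse_attributes(cols[8])
        gene_id = attrs.get("gene_id")
        if gene_id is None:
            continue
        name = attrs.get("gene_name")
        if gene_id not in names:
            names[gene_id] = name
        elif names[gene_id] is None and name:
            names[gene_id] = name  # a later record supplied the name
    return [(gid, name or gid, feature_type) for gid, name in names.items()]


example_gtf = [
    'chr1\tsrc\tgene\t1\t100\t.\t+\t.\tgene_id "G1"; gene_name "Alpha";',
    'chr1\tsrc\texon\t1\t50\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; gene_name "Alpha";',
    'chr1\tsrc\tgene\t200\t300\t.\t-\t.\tgene_id "G2";',
]

for row in feature_rows(example_gtf):
    print("\t".join(row))
```

With a table like this, the id and the name are always both available, so downstream tools can choose which one to display without editing the GTF.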