Open JannesSP opened 2 years ago
Hi @JannesSP,
Looks like there is an issue with the format of the hdf5 file, specifically that there is no chromosome attribute saved for the gene for some reason. Can you please describe how you ran yanocomp prep & yanocomp gmmtest? I noticed that you are only testing one gene, is this a viral RNA?
BW Matt
Hey @mparker2,
Exactly, I am testing it on viral RNA, so there is only one "chromosome", to which I aligned/mapped my reads and on which I ran nanopolish eventalign. I executed yanocomp prep with a command like this:
yanocomp prep -e nanopolish_eventalign.tsv -h yanocomp_prep.hdf5 -p 12
For yanocomp gmmtest I currently only have one control and one test sample.
Kind regards, Jannes
Hey @mparker2
Found the problem: the reference Fasta file I downloaded from a database had the '/' character in the header, which shows up as the contig in the nanopolish eventalign .tsv. The h5py API interprets '/' as a group separator when creating the hdf5 objects (datasets, groups etc.) during yanocomp prep. This is why multiple unwanted groups were created in the hdf5 file and gmmtest could not find the 'chrom' attribute. Maybe you can catch this error, replace this character in the code, or tell the user to check their Fasta headers (gene_id) for this character.
Kind regards, Jannes
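For anyone landing here with the same problem, the two options suggested above (replace the character, or fail early with a clear message) could look roughly like the sketch below. This is not yanocomp's code: the function names safe_hdf5_name and require_safe_name are hypothetical, and the example identifier is made up. It only relies on the fact reported above that '/' in a name is treated as a group separator.

```python
def safe_hdf5_name(name: str, replacement: str = "_") -> str:
    """Return `name` with '/' replaced so it maps to a single HDF5 group.

    Hypothetical helper, not part of yanocomp or h5py.
    """
    if "/" in replacement:
        raise ValueError("replacement must not contain '/'")
    return name.replace("/", replacement)


def require_safe_name(name: str) -> str:
    """Alternative: refuse the name with a readable error instead of renaming."""
    if "/" in name:
        raise ValueError(
            f"contig/gene id {name!r} contains '/', which HDF5 treats as a "
            "group separator; please rename it in the FASTA header"
        )
    return name


# Made-up identifier in the style of a database FASTA header.
print(safe_hdf5_name("hCoV-19/Germany/XY-1/2020"))  # hCoV-19_Germany_XY-1_2020
```

Renaming silently keeps the pipeline running but changes the id the user sees in the output; raising an error is stricter but makes the cause obvious at the prep step rather than at gmmtest.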
Aha! Well done for figuring that out. Sorry I didn't get any time to help... I should really sanitise the strings used to create all attributes better. Will mark this as a bug to get around to! Many thanks for reporting it.
Matt
Hi @mparker2, because I got the same error message as @JannesSP, I checked whether the headers in my fasta file include "/", which caused the error above.
Here is the first line of my fasta file. There is no "/" at the very top of the file.
However, I found lines in the middle of the file that do include "/". Could these cause the error? If so, how can I modify my fasta file so that yanocomp gmmtest runs without the error?
I am looking forward to hearing from you. Thank you!
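A possible way to check every header in the file (not just the first line) and to write a cleaned copy is sketched below. It assumes the contig name is the first whitespace-delimited token after '>', which is how aligners typically derive it; the function names and the file names in the comments are hypothetical. Since the contig name is carried from the FASTA into the nanopolish eventalign .tsv, the mapping, eventalign and yanocomp prep steps would presumably need to be re-run against the cleaned reference.

```python
import re
from typing import Iterable, Iterator, List, Tuple

# '>' followed by the sequence id (first token), then the rest of the line.
_HEADER = re.compile(r">(\S*)(.*)", re.DOTALL)


def find_slash_headers(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, sequence_id) for every header id containing '/'."""
    hits = []
    for lineno, line in enumerate(lines, start=1):
        match = _HEADER.match(line)
        if match and "/" in match.group(1):
            hits.append((lineno, match.group(1)))
    return hits


def sanitize_fasta(lines: Iterable[str], replacement: str = "_") -> Iterator[str]:
    """Yield the FASTA lines with '/' replaced in sequence ids only."""
    for line in lines:
        match = _HEADER.match(line)
        if match:
            yield ">" + match.group(1).replace("/", replacement) + match.group(2)
        else:
            yield line


# Small made-up example; the second offending header is not at the top.
example = [
    ">seq/1 some description\n", "ACGU\n",
    ">seq2\n", "GGCC\n",
    ">virus/strain/2020\n", "AUAU\n",
]
print(find_slash_headers(example))  # [(1, 'seq/1'), (5, 'virus/strain/2020')]
print("".join(sanitize_fasta(example)), end="")

# On real files (hypothetical names):
# with open("reference.fa") as src, open("reference.clean.fa", "w") as dst:
#     dst.writelines(sanitize_fasta(src))
```

Only the id token is rewritten, so any free-text description after the first space is left untouched.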
2022-10-07 12:37:26,831 WARNING Default min depth set to 6 to match window size 3
2022-10-07 12:37:26,836 INFO Running gmmtest in 3-comp GMM (uniform outliers) mode with 1 control datasets and 1 treatment datasets
2022-10-07 12:37:26,839 INFO 1 genes to be processed on 1 workers
Traceback (most recent call last):
  File "/home/yi98suv/anaconda3/envs/yanocomp/bin/yanocomp", line 8, in <module>
    sys.exit(cli())
  File "/home/yi98suv/anaconda3/envs/yanocomp/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/yi98suv/anaconda3/envs/yanocomp/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/yi98suv/anaconda3/envs/yanocomp/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/yi98suv/anaconda3/envs/yanocomp/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/yi98suv/anaconda3/envs/yanocomp/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/yi98suv/anaconda3/envs/yanocomp/lib/python3.9/site-packages/yanocomp/opts.py", line 16, in _make_dataclass
    return cmd(dynamic_dataclass(cls_name, bases=bases, **cli_kwargs))
  File "/home/yi98suv/anaconda3/envs/yanocomp/lib/python3.9/site-packages/yanocomp/gmmtest.py", line 333, in gmm_test
    res, sm_preds = parallel_test(opts)
  File "/home/yi98suv/anaconda3/envs/yanocomp/lib/python3.9/site-packages/yanocomp/gmmtest.py", line 210, in parallel_test
    res, sm_preds = test_chunk(
  File "/home/yi98suv/anaconda3/envs/yanocomp/lib/python3.9/site-packages/yanocomp/gmmtest.py", line 155, in test_chunk
    chrom, strand = load_gene_attrs(gene_id, cntrl_h5)
  File "/home/yi98suv/anaconda3/envs/yanocomp/lib/python3.9/site-packages/yanocomp/io.py", line 291, in load_gene_attrs
    chrom = g.attrs['chrom']
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/home/yi98suv/anaconda3/envs/yanocomp/lib/python3.9/site-packages/h5py/_hl/attrs.py", line 60, in __getitem__
    attr = h5a.open(self._id, self._e(name))
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5a.pyx", line 77, in h5py.h5a.open
KeyError: "Can't open attribute (can't locate attribute: 'chrom')"
Any idea how to fix this or where the error is coming from?