vanheeringen-lab / ANANSE

Prediction of key transcription factors in cell fate determination using enhancer networks. See full ANANSE documentation for detailed installation instructions and usage examples.
http://anansepy.readthedocs.io
MIT License
77 stars, 16 forks

bug? fatal: bad revision 'HEAD' #12

Closed JGASmits closed 4 years ago

JGASmits commented 4 years ago

When I try to run ananse binding using the following command:

ananse binding \
  -r /home/jsmits/ananse/KC_enh_int.bed \
  -o /home/jsmits/ananse/binding_KC.txt \
  -a /home/jsmits/tools/GRCh38.p13/GRCh38.p13.annotation.bed \
  -g /home/jsmits/tools/GRCh38.p13/GRCh38.p13.fa \
  -p /home/jsmits/git/ANANSE/data/gimme.vertebrate.v5.1.pfm

I get an error:

fatal: bad revision 'HEAD'
Traceback (most recent call last):
  File "/home/jsmits/anaconda3/envs/ananse/bin/ananse", line 286, in <module>
    args.func(args)
  File "/home/jsmits/anaconda3/envs/ananse/lib/python3.7/site-packages/ananse/commands/binding.py", line 23, in binding
    a.run_binding(args.fin_rpkm, args.outfile)
  File "/home/jsmits/anaconda3/envs/ananse/lib/python3.7/site-packages/ananse/binding.py", line 229, in run_binding
    filter_bed = self.clear_peak(peak_bed)
  File "/home/jsmits/anaconda3/envs/ananse/lib/python3.7/site-packages/ananse/binding.py", line 114, in clear_peak
    peaks = self.set_peak_size(peaks, 200)
  File "/home/jsmits/anaconda3/envs/ananse/lib/python3.7/site-packages/ananse/binding.py", line 82, in set_peak_size
    for peak in peaks:
  File "pybedtools/cbedtools.pyx", line 792, in pybedtools.cbedtools.IntervalIterator.__next__
  File "pybedtools/cbedtools.pyx", line 656, in pybedtools.cbedtools.create_interval_from_list
IndexError: list index out of range

That does not clearly tell me where things go wrong. What is going wrong here? :)

JGASmits commented 4 years ago

Updating ANANSE to 0.1.2 and removing the header together fixed this step for me (either one alone did not work).

simonvh commented 4 years ago

@JGASmits Which file had a header? Is this something that should be added to the documentation? If so, can you add it to #10 as an item?

JGASmits commented 4 years ago

I initially had a header in the input BED file passed to "ananse binding -r". The example file in the documentation is correct, though. Do BED files normally have a header? If not, this might not need further documentation.

JGASmits commented 4 years ago

After performing the motif scan it still crashes:

JGASmits commented 4 years ago

ananse binding -r /home/jsmits/ananse/KC_enh_int.bed -o /home/jsmits/ananse/binding_KC.txt -a /home/jsmits/tools/GRCh38.p13/GRCh38.p13.annotation.bed -g /home/jsmits/tools/GRCh38.p13/GRCh38.p13.fa -p /home/jsmits/gimmemotifs/data/motif_databases/gimme.vertebrate.v3.1.pwm
fatal: bad revision 'HEAD'
2020-04-08 16:44:15.056 | INFO | ananse.binding:run_binding:236 - Peak initializtion
2020-04-08 16:46:46.674 | INFO | ananse.binding:run_binding:240 - Motif scan
  0%|          | 0/51026 [00:00<?, ?it/s]
2020-04-08 16:47:48,839 - INFO - using 10000 sequences
100%|██████████| 51026/51026 [1:52:03<00:00, 7.59it/s]
2020-04-08 18:39:48.604 | INFO | ananse.binding:run_binding:244 - Predicting TF binding sites
Traceback (most recent call last):
  File "/home/jsmits/anaconda3/envs/ananse/bin/ananse", line 308, in <module>
    args.func(args)
  File "/home/jsmits/anaconda3/envs/ananse/lib/python3.7/site-packages/ananse/commands/binding.py", line 26, in binding
    a.run_binding(args.fin_rpkm, args.outfile)
  File "/home/jsmits/anaconda3/envs/ananse/lib/python3.7/site-packages/ananse/binding.py", line 247, in run_binding
    table = self.get_binding_score(pfm, peak)
  File "/home/jsmits/anaconda3/envs/ananse/lib/python3.7/site-packages/ananse/binding.py", line 219, in get_binding_score
    ft = dd.read_csv(self.factortable, sep="\t")
  File "/home/jsmits/anaconda3/envs/ananse/lib/python3.7/site-packages/dask/dataframe/io/csv.py", line 578, in read
    **kwargs
  File "/home/jsmits/anaconda3/envs/ananse/lib/python3.7/site-packages/dask/dataframe/io/csv.py", line 444, in read_pandas
    head = reader(BytesIO(b_sample), **kwargs)
  File "/home/jsmits/anaconda3/envs/ananse/lib/python3.7/site-packages/pandas/io/parsers.py", line 676, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/home/jsmits/anaconda3/envs/ananse/lib/python3.7/site-packages/pandas/io/parsers.py", line 454, in _read
    data = parser.read(nrows)
  File "/home/jsmits/anaconda3/envs/ananse/lib/python3.7/site-packages/pandas/io/parsers.py", line 1133, in read
    ret = self._engine.read(nrows)
  File "/home/jsmits/anaconda3/envs/ananse/lib/python3.7/site-packages/pandas/io/parsers.py", line 2037, in read
    data = self._reader.read(nrows)
  File "pandas/_libs/parsers.pyx", line 860, in pandas._libs.parsers.TextReader.read
  File "pandas/_libs/parsers.pyx", line 875, in pandas._libs.parsers.TextReader._read_low_memory
  File "pandas/_libs/parsers.pyx", line 929, in pandas._libs.parsers.TextReader._read_rows
  File "pandas/_libs/parsers.pyx", line 916, in pandas._libs.parsers.TextReader._tokenize_rows
  File "pandas/_libs/parsers.pyx", line 2071, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 2 fields in line 3, saw 4

JGASmits commented 4 years ago

Head of the -r BED file:

1	777679	779679	42
1	958280	960280	70
1	959551	961551	57
1	965347	967347	140
1	966097	968097	140
1	967163	969163	43
1	999166	1001166	64
1	999827	1001827	64
1	1012535	1014535	50
1	1018614	1020614	96
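Since a stray header line in the -r file was part of the problem reported earlier in this thread, a quick check along the following lines may help before running ananse binding. This is only a minimal sketch and not part of ANANSE: the function name `find_non_bed_lines` and the example header text are hypothetical, and the check simply assumes that a BED record has at least three fields with integer start and end coordinates.

```python
from typing import Iterable, List, Tuple


def find_non_bed_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for lines that do not look like BED records.

    Hypothetical helper: a line counts as a record when it has at least
    three whitespace-separated fields and fields 2 and 3 are integers.
    """
    suspicious = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # blank lines are ignored
        fields = line.split()
        looks_like_record = (
            len(fields) >= 3 and fields[1].isdigit() and fields[2].isdigit()
        )
        if not looks_like_record:
            suspicious.append((number, line))
    return suspicious


# The first line is an invented header; the thread does not show the real one.
example = [
    "chrom\tstart\tend\tscore\n",
    "1\t777679\t779679\t42\n",
    "1\t958280\t960280\t70\n",
]
print(find_non_bed_lines(example))
```

For a real file, the same function can be given an open file handle, for example `with open(path) as handle: find_non_bed_lines(handle)`; an empty result suggests there is no header-like line to remove.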

JGASmits commented 4 years ago

Head of the annotation file:

1	11868	14409	ENST00000456328	0	+	14409	14409	0	3	359,109,1189,	0,744,1352,
1	12009	13670	ENST00000450305	0	+	13670	13670	0	6	48,49,85,78,154,218,	0,169,603,965,1211,1443,
1	14403	29570	ENST00000488147	0	-	29570	29570	0	11	98,34,152,159,198,136,137,147,99,154,37,	0,601,1392,2203,2454,2829,3202,3511,3864,10334,15130,
1	17368	17436	ENST00000619216	0	-	17436	17436	0	1	68,	0,
1	29553	31097	ENST00000473358	0	+	31097	31097	0	3	486,104,122,	0,1010,1422,
1	30266	31109	ENST00000469289	0	+	31109	31109	0	2	401,134,	0,709,
1	30365	30503	ENST00000607096	0	+	30503	30503	0	1	138,	0,
1	34553	36081	ENST00000417324	0	-	36081	36081	0	3	621,205,361,	0,723,1167,
1	35244	36073	ENST00000461467	0	-	36073	36073	0	2	237,353,	0,476,
1	52472	53312	ENST00000606857	0	+	53312	53312	0	1	840,	0,

bioqxu commented 4 years ago

-p, --motifs: The input motif file (an example motif file for vertebrates is provided). If this option is given, there should also be a motif2factors.txt file and a factortable.txt file in the same folder (example files are provided for both).

Please read the details about these files in the README.

simonvh commented 4 years ago

@JGASmits BED files do not have a header.

@qxuchn Yeah, this is something that should change though. We should use the files that come by default with GimmeMotifs as that will allow for many different motif databases. See also #4

bioqxu commented 4 years ago

@simonvh I will fix it in a later version. For now, please provide this file, @JGASmits.

JGASmits commented 4 years ago

Aah, I changed the location of the motif database since I last ran it. It would, however, be helpful if the command gave an error such as 'motif database file missing' as soon as it can't find a file, instead of starting to run and only crashing at the moment the file is needed.
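The kind of up-front check suggested here could look roughly like the sketch below. It is not ANANSE's actual code: the function names are hypothetical, and it only assumes what was stated earlier in the thread, namely that motif2factors.txt and factortable.txt are expected in the same folder as the motif file given with -p.

```python
import os
from typing import List


def missing_motif_files(pfm_path: str) -> List[str]:
    """Return the required motif-related files that do not exist.

    Hypothetical helper: the companion file names are taken from the
    README text quoted in this thread.
    """
    pfm_path = os.path.abspath(pfm_path)
    folder = os.path.dirname(pfm_path)
    required = [
        pfm_path,
        os.path.join(folder, "motif2factors.txt"),
        os.path.join(folder, "factortable.txt"),
    ]
    return [path for path in required if not os.path.isfile(path)]


def check_motif_files(pfm_path: str) -> None:
    """Fail early, before any long-running step, if a file is missing."""
    missing = missing_motif_files(pfm_path)
    if missing:
        raise FileNotFoundError(
            "motif database file missing: " + ", ".join(missing)
        )
```

Calling `check_motif_files(...)` with the -p path at the very start of a run would surface a missing file within seconds rather than after the motif scan, which took almost two hours in the log above.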

JGASmits commented 4 years ago

I'll change the file location and try again! :)

JGASmits commented 4 years ago

ananse binding ran successfully! It still prints "fatal: bad revision 'HEAD'" every time I run it, but it successfully generated the output file.

ananse binding -r /home/jsmits/ananse/KC_enh_int.bed -o /home/jsmits/ananse/binding_KC.txt -a /home/jsmits/tools/GRCh38.p13/GRCh38.p13.annotation.bed -g /home/jsmits/tools/GRCh38.p13/GRCh38.p13.fa -p /home/jsmits/ANANSE/data/gimme.vertebrate.v5.1.pfm
fatal: bad revision 'HEAD'
2020-04-09 09:43:07.869 | INFO | ananse.binding:run_binding:236 - Peak initializtion
2020-04-09 09:45:05.389 | INFO | ananse.binding:run_binding:240 - Motif scan
100%|██████████| 51026/51026 [5:28:45<00:00, 2.59it/s]
2020-04-09 15:14:49.403 | INFO | ananse.binding:run_binding:244 - Predicting TF binding sites
2020-04-09 15:19:51.151 | INFO | ananse.binding:run_binding:249 - Save results