TRISTAN-ORF / RiboTIE

Scripts and instructions to apply RiboTIE on Ribo-seq data
MIT License

Filename too big when combining studies #11

Open NicolasProvencher opened 3 weeks ago

NicolasProvencher commented 3 weeks ago

Hi, I followed your suggestion to try to fix my "BAM too big" problem (#8), since I have a lot of small, similar BAM files. Since your code seems to stack all the sample names in the output file name, the resulting name is now too long. Here's the error:

Traceback (most recent call last):
  File "/opt/conda/envs/riboformer_8_3/bin/ribotie", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/opt/conda/envs/riboformer_8_3/lib/python3.12/site-packages/transcript_transformer/ribotie.py", line 190, in main
    predict(args_set, trainer=trainer, model=model, postprocess=False)
  File "/opt/conda/envs/riboformer_8_3/lib/python3.12/site-packages/transcript_transformer/transcript_transformer.py", line 302, in predict
    np.save(
  File "/opt/conda/envs/riboformer_8_3/lib/python3.12/site-packages/numpy/lib/npyio.py", line 542, in save
    file_ctx = open(file, "wb")
               ^^^^^^^^^^^^^^^^
OSError: [Errno 36] File name too long: '/home/noxatras/scratch/riboformer/straglers/46/46GSM1371443&GSM1371444&GSM1371445&GSM1371446&GSM1371447&GSM1371448&GSM1371449&GSM1371450&GSM1371451&GSM1371452&GSM1371453&GSM1371454&GSM1371455&GSM1371456&GSM1371457&GSM1371458&GSM1371459&GSM1371460&GSM1371461&GSM1371462&GSM1371463&GSM1371464&GSM1371465&GSM1371466&GSM1371467&GSM1371468&GSM1371469&GSM1371470&GSM1371471&GSM1371472&GSM1371473&GSM1371474&GSM1371475&GSM1371476&GSM1371477&GSM1371478&GSM1371479&GSM1371480&GSM1371481&GSM1371482&GSM1371483&GSM1371484&GSM1371485&GSM1371486&GSM1371487&GSM1371488&GSM1371489&GSM1371490_f1.npy'
Job finished with exit code 1 at: Thu 13 Jun 2024 05:32:23 PM EDT

I tried specifying an outprefix, thinking it would override the stacking of input names, but it doesn't seem to fix the problem. Apart from the file name issue, my #8 seems to be fixed, so I'm going to close that issue.

Maybe add a simple check on how many files there are in the merge and, if there are more than 5, replace all subsequent file names with something like 'and all' or 'etc'. Or, in the merged case, if an outprefix is given, just output outprefix_merged_suffix.file.
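A minimal sketch of the check suggested above, not RiboTIE's actual code. The function name `merged_name`, the `max_listed` parameter, and the `&` separator (inferred from the error message) are assumptions for illustration. The short SHA-1 digest is an addition beyond the suggestion, so that two different merges sharing the same first five ids do not collide on the same file name.

```python
import hashlib


def merged_name(ids, max_listed=5, sep="&"):
    """Join sample ids; collapse everything past max_listed into a short tail.

    Hypothetical helper: the tail holds the number of omitted ids plus an
    8-character digest of the full id list, so the name stays unique.
    """
    ids = list(ids)
    if len(ids) <= max_listed:
        # Few enough ids: keep the plain stacked name.
        return sep.join(ids)
    # Digest over the complete list, so different merges give different names.
    digest = hashlib.sha1(sep.join(ids).encode("utf-8")).hexdigest()[:8]
    head = sep.join(ids[:max_listed])
    return f"{head}{sep}and_{len(ids) - max_listed}_more_{digest}"


if __name__ == "__main__":
    # The 48 ids from the traceback: GSM1371443 .. GSM1371490.
    sample_ids = [f"GSM{1371443 + i}" for i in range(48)]
    print(merged_name(sample_ids))
```

The digest makes the name reproducible for a given id list, but the full list would still need to be recorded somewhere (for example in the config) to map the name back to the individual samples.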

jdcla commented 2 weeks ago

Interesting problem.

I'll start looking into a smart way to limit the length of filenames while still keeping them linked to the ribo_ids defined in the configuration file.

I might not be able to get to this for a little bit. I hope you can continue working for now.

An immediate solution would be to use shorter keys in the dictionary given under ribo_paths.
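To illustrate the workaround, here is a rough sketch that generates short keys for a list of long sample ids and checks whether a stacked file name would fit. The helpers `short_keys` and `fits_name_limit` are hypothetical, the 255-byte limit is an assumption that holds for common Linux filesystems but may differ elsewhere, and the name pattern (prefix, ids joined by `&`, suffix) is inferred from the error message rather than from the RiboTIE source.

```python
import os


def short_keys(sample_ids, prefix="s"):
    """Map short keys (s01, s02, ...) to the original long sample ids."""
    sample_ids = list(sample_ids)
    width = len(str(len(sample_ids)))  # zero-pad so keys sort naturally
    return {
        f"{prefix}{i:0{width}d}": sid
        for i, sid in enumerate(sample_ids, start=1)
    }


def fits_name_limit(filename, limit=255):
    """Return True if the last path component is at most `limit` bytes.

    Assumption: 255 bytes is the per-component limit on the target filesystem.
    """
    return len(os.path.basename(filename).encode("utf-8")) <= limit


if __name__ == "__main__":
    ids = [f"GSM{1371443 + i}" for i in range(48)]
    keys = short_keys(ids)
    stacked_long = "46" + "&".join(ids) + "_f1.npy"
    stacked_short = "46" + "&".join(keys) + "_f1.npy"
    print(fits_name_limit(stacked_long), fits_name_limit(stacked_short))
```

The returned dictionary doubles as a lookup table, so the short keys used under ribo_paths can be traced back to the original ids.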

NicolasProvencher commented 2 weeks ago

Yeah, no worries. I've already temporarily replaced the file names with single letters in the yml.