Open xiadawei123 opened 1 year ago
It looks like some temp output files are missing. Can you check and make sure you have enough disk space for output files?
OK, thank you for your prompt reply. I will increase the memory usage and try again.
Hi, after ensuring that the server has sufficient memory (1.4T) and disk space, I reran the Hi-C alignment program from chromap. However, I am still encountering the same error. I'm not sure whether it's due to insufficient memory or some other reason. Could you provide me with some assistance? Thank you.
As I mentioned, you have too many reads and thus you need to make sure there is enough disk space for your output. The error message indicates that you don't. Memory is not related at all.
It would be great if you can check if you have enough disk for your output. If not, maybe delete some of your old files and make enough space for the output. Increasing memory is not helpful in this case.
Besides, why did you use -k 27? Did the default value work?
I have a 47T disk, so I think there should be enough space. Could there be another reason? As for -k 27: the genome is close to 10 G, and I saw that your previous advice to others was to increase the k setting. I also tried the default k parameter, but I still got the same error.
I see. Can you run some command line to check your available disk space? I forget the exact command line. It might be "du -sh" or something else.
Yes, I often use du -sh or df -h to check disk space, and I reserved 47T of space for the chromap Hi-C alignment. Thank you very much for your reply. I will run it again and check at the end whether all 47T has been used up.
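For reference, the two commands answer different questions: df reports the free space on a filesystem, while du reports how much a directory already uses. A minimal sketch, assuming the output and temp files are written to the current directory (the free_gib helper and the out_dir variable are illustrative names, not part of chromap):

```shell
# Hypothetical helper (not part of chromap): print the free space, in whole GiB,
# on the filesystem that holds the given directory.
free_gib() {
    # -P forces one POSIX-format line per filesystem; -k reports 1024-byte blocks.
    # Column 4 of the second line is the available space.
    df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1048576 }'
}

out_dir="."   # assumption: chromap writes its output and temp files here

echo "Free on the filesystem holding $out_dir: $(free_gib "$out_dir") GiB"

# du answers a different question: how much the directory already uses.
du -sh "$out_dir"
```

Checking free space with df before and after a run shows whether the disk actually filled up during mapping.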
Hi, there is plenty of disk space, so I don't think this is a disk-space problem. If you have any ideas for solving it, please let me know. Thank you.
This is weird. After the run, did you check if the temporary mapping files are in the output dir? You may run "ls" and see if they are there. And can you remove "--remove-pcr-duplicates" in the command line? I guess it is not very useful for hi-c? How many sequences are there in your contig.fa files?
Besides, can you show the beginning of your log?
Yes, I also find it very strange. The program successfully generated a large number of temporary files, each approximately 1 GB in size. I added "--remove-pcr-duplicates" because I need to use the SAM file from chromap as input to YaHS for chromosome building, and the YaHS instructions emphasize that PCR duplicates need to be removed from the SAM file. As shown below, I have displayed some of the temporary files and the beginning of the log file. If you have any additional suggestions, please let me know. Thank you again for your response.
The error message indicates that chromap was trying to open a temp mapping file but nothing is found. Initially, I was assuming your disk space was full and temp mapping files were not able to be generated and thus cannot be opened. But it seems that this is not the case. From the log, I didn't see errors.
This is hard to debug on my side, as it is hard for us to reproduce the error. If the dataset is publicly available, we can download it and try it. Otherwise, we have to change the code a little to make it generate more error messages and ask you to try it again, so that we can understand what exactly happened. Alternatively, you can use bwa-mem for your pipeline. It would be much slower than Chromap in this case, but it might work.
Thanks again for your timely reply. We have been trying multiple methods for chromosome construction in parallel, including bwa mem. Yesterday, I switched to a server with better performance and tried to run chromap there. If there is any problem, I will let you know promptly.
I have the same issue and I'm sure my disk space is enough, may I inquire if there has been any progress or resolution to the matter? I appreciate your time and assistance.
Can you provide your log?
It creates a bunch of temp files and the log shows my command line is:
nohup chromap --preset hic -r /home/data3/hsh/genome/maguan_goat_assembly/02.genome_with_hic_hifiasm/M11/M11.hic.p_ctg.fa -x /home/data3/hsh/genome/maguan_goat_assembly/02.genome_with_hic_hifiasm/M11/M11.hic.p_ctg.index --remove-pcr-duplicates -1 /home/data3/hsh/genome/maguan_goat_assembly/03.scaffold/M11/fastq/M11_hic_merge_R1.fastq.gz -2 /home/data3/hsh/genome/maguan_goat_assembly/03.scaffold/M11/fastq/M11_hic_merge_R2.fastq.gz --SAM -o M11.hic.aligned.sam -t 100 >> M11_chromap_align.log 2>&1 &
Hello, has this issue been solved? I also encountered a similar issue, and I suspect it may be caused by the size of the temp files. My server has 1.5T of memory and 15T of free disk space. Did you check the TempMappingFileHandle code (temp_mapping.h)? Maybe the files are too big for it to handle.
Not yet. I think the SAM output function has an error; the other output options (--BED/--TagAlign) work fine.
@xiadawei123 Were you able to run Chromap as you mentioned?
If any of you are using publicly available datasets, please let me know, I can try to reproduce the error. It is impossible to just debug only with these error messages.
@xiadawei123 Hi, have you solved the problem? I have the same problem with a relatively smaller genome, about 4G in size, and the disk space is enough to run it.
Hi, I am very sorry that I failed to solve this problem and then replaced it with other alternative software. I don't have any tips to give you. Good luck to you
------------------ Original message ------------------ From: "haowenz/chromap"; Sent: Tuesday, April 30, 2024, 9:36 AM; Subject: Re: [haowenz/chromap] An unknown error (Issue #142)
@utpala101 In the new version, Chromap will print an error message on which temp file it tries to open. This may help find some debug information. Did the same error occur on your data?
@mourisl Yes, I have the same error. Chromap prints that a temp SAM file is missing, but the file is in the directory, so I don't know what is wrong.
@xiadawei123 Thank you so much for your timely reply! I will keep looking for a solution.
What is the file name? Is it empty?
Sorry, I have deleted the file, but it was not empty. The file name is aligned.sam.temp1019
Thank you for sharing the information. I think this may relate to the number of file handles a program can open on a Linux machine, where the default limit is 1024. Counting the input and output files as well, I think 1019 temp files may reach that limit. We will add an option to specify the number of reads in each temp file so that the number of temp files can be reduced.
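A rough way to compare the two numbers from a shell, as a sketch only: the count_temp_files helper, the reserved margin of 5 descriptors, and the aligned.sam output name are assumptions for illustration, and the "<output>.temp<N>" naming is taken from the error message reported in this thread.

```shell
# Hypothetical helper: count files named "<output>.temp<N>" in a directory.
count_temp_files() {
    # $1 = directory, $2 = output file name, e.g. aligned.sam
    n=0
    for f in "$1"/"$2".temp*; do
        # If nothing matches, the pattern stays literal; -e filters that out.
        [ -e "$f" ] && n=$((n + 1))
    done
    echo "$n"
}

soft_limit=$(ulimit -Sn)                  # per-process soft limit on open files
n_temp=$(count_temp_files . aligned.sam)  # assumption: run from the output directory
reserved=5                                # assumption: stdin/stdout/stderr plus input and output files

echo "soft limit: $soft_limit, temp files: $n_temp"

# The limit can be "unlimited", so compare only when it is numeric.
if [ "$soft_limit" != "unlimited" ] && [ "$n_temp" -ge $((soft_limit - reserved)) ]; then
    echo "temp file count is at or near the open-file limit"
fi
```

If the warning fires, the temp files may all exist on disk and still fail to open, which would match a "missing" file that is actually present.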
I have updated the code so that each temp file holds more reads when too many temp files would otherwise be created, though this may increase memory usage. The updated code is in the li_dev7 branch. Could you please check out this branch and give it a try? Thank you!
@mourisl Sorry for the delay. I have tested the new code, but it still failed. The error messages are as follows:
Mapped all reads in 41092.57s.
Number of reads: 3393404416.
Number of mapped reads: 2658488394.
Number of uniquely mapped reads: 1961435054.
Number of reads have multi-mappings: 697053340.
Number of candidates: 580287500208.
Number of mappings: 2658488394.
Number of uni-mappings: 1961435054.
Number of multi-mappings: 697053340.
Temporary file aligned.sam.temp1019 is missing.
chromap: src/temp_mapping.h:45: void chromap::TempMappingFileHandle::InitializeTempMappingLoading(uint32_t) [with MappingRecord = chromap::SAMMapping; uint32_t = unsigned int]: Assertion `file != __null' failed.
Aborted (core dumped)
My working directory contained 1019 temp files, which matches the error line, and the temp1019 file is much smaller than the preceding temp files (the temp1019 file is 96 MB; the preceding ones are about 960 MB each). I am not sure whether my data have problems, but thank you for your work!
Thank you for the testing! It is probably still my implementation error. I'll look into it.
You can try the command "ulimit -n 4096" on the node of your cluster.
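Note that ulimit only affects the current shell and the processes started from it, so it has to be run in the same session or job script that launches chromap. A sketch that raises the soft limit only when needed; 4096 is the value suggested above, and whether the hard limit permits it depends on the system:

```shell
wanted=4096              # value suggested in this thread
soft=$(ulimit -Sn)       # current soft limit on open files
hard=$(ulimit -Hn)       # ceiling that a non-root user cannot exceed

if [ "$soft" != "unlimited" ] && [ "$soft" -lt "$wanted" ]; then
    if [ "$hard" = "unlimited" ] || [ "$hard" -ge "$wanted" ]; then
        ulimit -Sn "$wanted"
    else
        echo "hard limit $hard is below $wanted; an administrator would need to raise it" >&2
    fi
fi

echo "open-file soft limit is now $(ulimit -Sn)"
# Start chromap from this same shell so that it inherits the new limit.
```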
Sorry for the delayed reply @utpala101. The branch's code should be able to handle 20 billion reads. I have updated the code in the li_dev7 branch, which should allow more reads per temp file. The branch also adds a warning message whenever the temp file volume is increased, for debugging purposes. If you are still working on the data, could you please give it a try?
Hi
I am using chromap, the software you developed, to map Hi-C reads, but an error occurred during the alignment. I hope to get your help. The following are my commands and the error. Thank you.
conda create -n chromap_yahs -c bioconda -c conda-forge chromap samtools yahs samtools assembly-stats openjdk
samtools faidx contig.fa
chromap -i -r contig.fa -o index -w 14 -k 27
nohup chromap --preset hic -r contig.fa -x index --remove-pcr-duplicates -1 R1.fastq -2 R2.fastq --SAM -o aligned.sam -t 90 &